Module fast_transformers.builders

This module implements builders that simplify building complex transformer architectures with different attention mechanisms.

The main idea is to facilitate the construction of various attention layers and transformer encoder layers and to simplify their assembly into a single transformer module. It also keeps scripts flexible, since many builder parameters correspond one-to-one with command line arguments.

Example usage:

    from fast_transformers.builders import TransformerEncoderBuilder

    builder = TransformerEncoderBuilder()
    builder.n_layers = 12
    builder.n_heads = 8
    builder.feed_forward_dimensions = 1024
    builder.query_dimensions = 64
    builder.value_dimensions = 64
    builder.dropout = 0.1
    builder.attention_dropout = 0.1
    builder.attention_type = "linear"
    transformer = builder.get()
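
Because the builder parameters above are plain attributes holding scalar values, they can be mapped one-to-one onto command line arguments. A minimal sketch of that mapping, using only the attributes from the example above (the argument names are illustrative):

    import argparse

    from fast_transformers.builders import TransformerEncoderBuilder

    parser = argparse.ArgumentParser()
    parser.add_argument("--n_layers", type=int, default=12)
    parser.add_argument("--n_heads", type=int, default=8)
    parser.add_argument("--attention_type", type=str, default="linear")
    args = parser.parse_args()

    builder = TransformerEncoderBuilder()
    # Copy each command line argument onto the builder attribute of the same name.
    for name, value in vars(args).items():
        setattr(builder, name, value)
    transformer = builder.get()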
Source code
#
# Copyright (c) 2020 Idiap Research Institute, http://www.idiap.ch/
# Written by Angelos Katharopoulos <angelos.katharopoulos@idiap.ch>,
# Apoorv Vyas <avyas@idiap.ch>
#

"""This module implements builders that simplify building complex transformer
architectures with different attention mechanisms.

The main idea is to facilitate the construction of various attention layers and
transformer encoder layers and simplify their assembly into one transformer
module. It also allows for flexibility in the scripts as many builder
parameters can correspond 1-1 with command line arguments.

Example usage:

    builder = TransformerEncoderBuilder()
    builder.n_layers = 12
    builder.n_heads = 8
    builder.feed_forward_dimensions = 1024
    builder.query_dimensions = 64
    builder.value_dimensions = 64
    builder.dropout = 0.1
    builder.attention_dropout = 0.1
    builder.attention_type = "linear"
    transformer = builder.get()
"""

__all__ = [
    "AttentionBuilder",
    "RecurrentAttentionBuilder",
    "RecurrentCrossAttentionBuilder"
]

# Import the attention implementations so that they register themselves with
# the builder. Attention implementations external to the library should be
# imported before using the builders.
#
# TODO: Should this behaviour change? Namely, should all attention
#       implementations be imported in order to be useable? This also allows
#       using the library even partially built, for instance.
from ..attention import \
    FullAttention, \
    LinearAttention, CausalLinearAttention, \
    ClusteredAttention, ImprovedClusteredAttention, \
    ReformerAttention, \
    ExactTopKAttention, ImprovedClusteredCausalAttention, \
    ConditionalFullAttention
del FullAttention, \
    LinearAttention, CausalLinearAttention, \
    ClusteredAttention, ImprovedClusteredAttention, \
    ReformerAttention, \
    ExactTopKAttention, ImprovedClusteredCausalAttention, \
    ConditionalFullAttention


from .attention_builders import \
    AttentionBuilder, \
    RecurrentAttentionBuilder, \
    RecurrentCrossAttentionBuilder

from .transformer_builders import \
    TransformerEncoderBuilder, \
    RecurrentEncoderBuilder, \
    TransformerDecoderBuilder, \
    RecurrentDecoderBuilder
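
As the comment in the source above notes, attention implementations register themselves with the builders when their module is imported, so an implementation defined outside the library must be imported (for its side effects) before a builder can construct it. A minimal sketch, where my_project.my_attention and the "my-attention" type name are hypothetical placeholders for such an external implementation:

    # Hypothetical external module whose import registers a custom attention
    # implementation with the builders, mirroring the library's own attention
    # imports above.
    import my_project.my_attention  # noqa: F401  (imported for its side effects)

    from fast_transformers.builders import TransformerEncoderBuilder

    builder = TransformerEncoderBuilder()
    builder.attention_type = "my-attention"  # hypothetical registered name
    transformer = builder.get()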

Sub-modules

fast_transformers.builders.attention_builders

fast_transformers.builders.base

Provide a class for the others to inherit some useful functionality.

fast_transformers.builders.transformer_builders

Build complex transformer architectures for inference or training easily.

Classes

class AttentionBuilder

Build attention implementations for batch sequence processing or training.

Source code
class AttentionBuilder(BaseAttentionBuilder):
    """Build attention implementations for batch sequence processing or
    training."""
    def __init__(self):
        super(AttentionBuilder, self).__init__(AttentionRegistry)
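
For illustration, a minimal usage sketch; it assumes the attention builder follows the same attribute-based pattern as TransformerEncoderBuilder above and that get() takes the name of the registered attention type (both assumptions, not confirmed by the excerpt shown here):

    from fast_transformers.builders import AttentionBuilder

    builder = AttentionBuilder()
    # Assumed parameter names, mirroring the transformer builder example above.
    builder.query_dimensions = 64
    builder.attention_dropout = 0.1
    # Assumed: get() receives the registered attention type name.
    attention = builder.get("linear")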

Ancestors

BaseAttentionBuilder

Inherited members

class RecurrentAttentionBuilder

Build attention implementations for autoregressive sequence processing.

Source code
class RecurrentAttentionBuilder(BaseAttentionBuilder):
    """Build attention implementations for autoregressive sequence
    processing."""
    def __init__(self):
        super(RecurrentAttentionBuilder, self).__init__(
            RecurrentAttentionRegistry
        )

Ancestors

BaseAttentionBuilder

Inherited members

class RecurrentCrossAttentionBuilder

Build attention implementations for autoregressive cross attention computation.

Source code
class RecurrentCrossAttentionBuilder(BaseAttentionBuilder):
    """Build attention implementations for autoregressive cross attention
    computation."""
    def __init__(self):
        super(RecurrentCrossAttentionBuilder, self).__init__(
            RecurrentCrossAttentionRegistry
        )

Ancestors

BaseAttentionBuilder

Inherited members