neuralop.layers.coda_layer.CODALayer
- class neuralop.layers.coda_layer.CODALayer(n_modes, n_heads=1, token_codimension=1, head_codimension=None, codimension_size=None, per_channel_attention=True, permutation_eq=True, norm='instance_norm', temperature=1.0, nonlinear_attention=False, scale=None, resolution_scaling_factor=None, incremental_n_modes=None, non_linearity=<built-in function gelu>, use_channel_mlp=True, channel_mlp_expansion=1.0, fno_skip='linear', channel_mlp_skip='linear', preactivation=False, separable=False, factorization='tucker', rank=1.0, joint_factorization=False, conv_module=<class 'neuralop.layers.spectral_convolution.SpectralConv'>, fixed_rank_modes=False, implementation='factorized', decomposition_kwargs=None)[source]
Co-domain Attention Blocks (CODALayer)
Implements the transformer architecture in the operator learning framework, as described in [1]. A usage sketch is given after the parameter listing below.
- Parameters:
- n_modes : list
Number of modes for each dimension, used in the K, Q, V operators.
- n_heads : int, optional
Number of heads for the attention mechanism, by default 1
- token_codimension : int, optional
Co-dimension of each variable, i.e. number of output channels associated with each variable, by default 1
- head_codimension : int, optional
Co-dimension of each output token for each head, by default None
- codimension_size : int, optional
Size of the codimension for the whole function. Only used for permutation_eq = False, by default None
- per_channel_attention : bool, optional
Whether to use per-channel attention; if True, token_codimension is overwritten to 1, by default True
- permutation_eq : bool, optional
Whether to use a permutation-equivariant mixer layer after the attention mechanism, by default True
- norm : literal {'instance_norm'} or None, optional
Normalization module to be used. If 'instance_norm', instance normalization is applied to the token outputs of the attention module, by default 'instance_norm'
- temperature : float, optional
Temperature parameter for the attention mechanism, by default 1.0
- nonlinear_attention : bool, optional
Whether to use a non-linear activation in the K, Q, V operators, by default False
- scale : int, optional
Scale for downsampling the Q and K functions before computing the attention matrix; a higher scale downsamples more, by default None
- resolution_scaling_factor : float, optional
Scaling factor for the output, by default None
Methods
- compute_attention(tokens, batch_size): Compute the key-query-value variant of the attention matrix for input token functions.
- forward(x[, output_shape]): CoDANO's forward pass.
- Other Parameters:
- incremental_n_modes : list, optional
Incremental number of modes for each dimension (for incremental training), by default None
- use_channel_mlp : bool, optional
Whether to use MLP layers to parameterize skip connections, by default True
- channel_mlp_expansion : float, optional
Expansion parameter for self.channel_mlp, by default 1.0
- non_linearity : callable, optional
Non-linearity function to be used. Options: F.gelu, F.relu, F.leaky_relu, F.silu, F.tanh, by default F.gelu
- preactivation : bool, optional
Whether to use preactivation, by default False
- fno_skip : str, optional
Type of skip connection to be used. Options: "linear", "soft-gating", "identity", by default 'linear'
- channel_mlp_skip : str, optional
Module to use for ChannelMLP skip connections. Options: "linear", "soft-gating", "identity", by default 'linear'
- separable : bool, optional
Whether to use separable convolutions, by default False
- factorization : str, optional
Type of factorization to be used. Options: "tucker", "cp", "tt", None, by default 'tucker'
- rank : float, optional
Rank of the factorization, by default 1.0
- conv_module : callable, optional
Spectral convolution module to be used, by default SpectralConv
- joint_factorization : bool, optional
Whether to factorize all SpectralConv weights as one tensor, by default False
References
[1] Rahman, M. A., et al. "Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs," 2024.
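The following sketch shows how a CODALayer might be instantiated with a handful of the parameters documented above. The mode counts, head count, and normalization choice are illustrative assumptions, not values prescribed by the documentation.

import torch
from neuralop.layers.coda_layer import CODALayer

# Illustrative configuration only: n_modes and n_heads are example values.
layer = CODALayer(
    n_modes=[16, 16],            # modes per spatial dimension for the K, Q, V operators
    n_heads=2,                   # number of attention heads
    per_channel_attention=True,  # each channel becomes its own token (token_codimension -> 1)
    norm="instance_norm",        # normalize token outputs of the attention module
    temperature=1.0,             # softmax temperature of the attention weights
)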
- compute_attention(tokens, batch_size)[source]
Compute the key-query-value variant of the attention matrix for input token functions.
- Parameters:
- tokens : torch.Tensor
Input tokens with shape (b * t, d, h, w, …), where: b is the batch size, t is the number of tokens, d is the token codimension, and h, w, … are the domain dimensions. Assumes input tokens have been normalized.
- batch_size : int
The size of the batch.
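As a rough illustration of what this key-query-value attention over token functions computes, the sketch below (plain PyTorch, not the layer's actual implementation) forms attention scores between token functions via a discretized inner product over the codimension and the spatial domain, then mixes the value functions with the resulting weights; the K, Q, V tensors here are assumed to have already been produced by the corresponding operators.

import torch
import torch.nn.functional as F

# Conceptual sketch only: function-valued keys/queries/values of shape (b, t, d, h, w).
b, t, d, h, w = 2, 3, 1, 32, 32
k = torch.randn(b, t, d, h, w)
q = torch.randn(b, t, d, h, w)
v = torch.randn(b, t, d, h, w)

# Score between token functions i and j: inner product over codimension and domain.
scores = torch.einsum("bidhw,bjdhw->bij", q, k) / (d * h * w)
weights = F.softmax(scores / 1.0, dim=-1)  # temperature = 1.0

# Each output token function is an attention-weighted mix of the value functions.
out = torch.einsum("bij,bjdhw->bidhw", weights, v)  # shape (b, t, d, h, w)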
- forward(x, output_shape=None)[source]
CoDANO’s forward pass.
If self.permutation_eq == True, computes the permutation-equivariant forward pass, where the mixer FNO block is applied to each token separately, making the final result equivariant to any permutation of tokens.
If self.permutation_eq == False, the mixer is applied to the whole function together, and tokens are treated as channels within the same function.
- Parameters:
- x : torch.Tensor
Input tensor with shape (b, t * d, h, w, …), where b is the batch size, t is the number of tokens, and d is the token codimension.
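A minimal forward-pass sketch, assuming a 2D domain with b = 2 samples, t = 3 token functions of codimension d = 1 (the default per-channel setting), and a 64 x 64 grid; with resolution_scaling_factor left at None the output is expected to keep the input shape.

import torch
from neuralop.layers.coda_layer import CODALayer

layer = CODALayer(n_modes=[16, 16])  # defaults: n_heads=1, per_channel_attention=True

b, t, d, h, w = 2, 3, 1, 64, 64
x = torch.randn(b, t * d, h, w)      # channel-stacked token functions

out = layer(x)
print(out.shape)                     # expected: torch.Size([2, 3, 64, 64])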