neuralop.layers.coda_layer.CODALayer

class neuralop.layers.coda_layer.CODALayer(n_modes, n_heads=1, token_codimension=1, head_codimension=None, codimension_size=None, per_channel_attention=True, permutation_eq=True, norm='instance_norm', temperature=1.0, nonlinear_attention=False, scale=None, resolution_scaling_factor=None, incremental_n_modes=None, non_linearity=<built-in function gelu>, use_channel_mlp=True, channel_mlp_expansion=1.0, fno_skip='linear', channel_mlp_skip='linear', preactivation=False, separable=False, factorization='tucker', rank=1.0, joint_factorization=False, conv_module=<class 'neuralop.layers.spectral_convolution.SpectralConv'>, fixed_rank_modes=False, implementation='factorized', decomposition_kwargs=None)[source]

Co-domain Attention Blocks (CODALayer)

It implements the transformer architecture in the operator learning framework, as described in [1].

Parameters:
n_modes : list

Number of modes for each dimension, used in the K, Q, V operators.

n_heads : int, optional

Number of heads for the attention mechanism, by default 1

token_codimension : int, optional

Co-dimension of each variable, i.e. number of output channels associated with each variable, by default 1

head_codimension : int, optional

Co-dimension of each output token for each head, by default None

codimension_size : int, optional

Size of the codimension for the whole function. Only used when permutation_eq = False, by default None

per_channel_attention : bool, optional

Whether to use per-channel attention. If True, token_codimension is overwritten to 1, by default True

permutation_eq : bool, optional

Whether to use a permutation-equivariant mixer layer after the attention mechanism, by default True

norm : literal {‘instance_norm’} or None, optional

Normalization module to be used. Options: “instance_norm”, None. If ‘instance_norm’, instance normalization is applied to the token outputs of the attention module, by default “instance_norm”

temperature : float, optional

Temperature parameter for the attention mechanism, by default 1.0

nonlinear_attention : bool, optional

Whether to use a non-linear activation in the K, Q, V operators, by default False

scale : int, optional

Scale for downsampling the Q and K functions before computing the attention matrix. A higher scale downsamples more, by default None

resolution_scaling_factor : float, optional

Scaling factor for the output resolution, by default None
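
A minimal usage sketch under the defaults above (per-channel attention, instance norm). The concrete shapes and the assumption that the output keeps one channel per token are illustrative, not guaranteed by this page:

    import torch
    from neuralop.layers.coda_layer import CODALayer

    # 2D problem with 3 physical variables (tokens); with per_channel_attention=True
    # (the default) each token has codimension 1, so channels == tokens.
    layer = CODALayer(n_modes=[16, 16], n_heads=2)

    b, t, d = 4, 3, 1                 # batch size, number of tokens, token codimension
    h, w = 64, 64                     # spatial resolution
    x = torch.randn(b, t * d, h, w)   # forward expects shape (b, t*d, h, w, ...)

    out = layer(x)
    print(out.shape)                  # assumed: (4, 3, 64, 64) when no resolution scaling is used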

Methods

compute_attention(tokens, batch_size)

Compute the key-query-value variant of the attention matrix for input token functions.

forward(x[, output_shape])

CoDANO's forward pass.

Other Parameters:
incremental_n_modes : list, optional

Incremental number of modes for each dimension (for incremental training), by default None

use_channel_mlp : bool, optional

Whether to use MLP layers to parameterize skip connections, by default True

channel_mlp_expansion : float, optional

Expansion parameter for self.channel_mlp, by default 1.0

non_linearity : callable, optional

Non-linearity function to be used. Options: F.gelu, F.relu, F.leaky_relu, F.silu, F.tanh, by default F.gelu

preactivation : bool, optional

Whether to use preactivation, by default False

fno_skip : str, optional

Type of skip connection to be used. Options: “linear”, “soft-gating”, “identity”, by default ‘linear’

channel_mlp_skip : str, optional

Module to use for ChannelMLP skip connections. Options: “linear”, “soft-gating”, “identity”, by default ‘linear’

separable : bool, optional

Whether to use separable convolutions, by default False

factorization : str, optional

Type of factorization to be used. Options: “tucker”, “cp”, “tt”, None, by default ‘tucker’

rank : float, optional

Rank of the factorization, by default 1.0

conv_module : callable, optional

Spectral convolution module to be used, by default SpectralConv

joint_factorization : bool, optional

Whether to factorize all SpectralConv weights as one tensor, by default False
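
For the FNO-block options above, a hedged constructor sketch with arbitrary example values (not recommended settings):

    import torch.nn.functional as F
    from neuralop.layers.coda_layer import CODALayer

    # CP factorization at half rank, soft-gating skip connections, SiLU activation.
    layer = CODALayer(
        n_modes=[16, 16],
        n_heads=2,
        factorization="cp",
        rank=0.5,
        fno_skip="soft-gating",
        channel_mlp_skip="soft-gating",
        non_linearity=F.silu,
    )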

References

compute_attention(tokens, batch_size)[source]

Compute the key-query-value variant of the attention matrix for input token functions.

Parameters:
tokens : torch.Tensor

Input tokens with shape (b * t, d, h, w, …), where: b is the batch size, t is the number of tokens, d is the token codimension, and h, w, … are the domain dimensions. Assumes input tokens have been normalized.

batch_size : int

The size of the batch.
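
A shape-only sketch in plain torch (not the layer's internal code) of how the layer input of shape (b, t * d, h, w, …) maps onto the per-token layout (b * t, d, h, w, …) described above, assuming a token-major channel ordering:

    import torch

    b, t, d, h, w = 2, 3, 4, 32, 32
    x = torch.randn(b, t * d, h, w)   # layer input: t tokens of codimension d,
                                      # stacked along the channel dimension

    # Each token function becomes its own leading-dimension entry of codimension d.
    tokens = x.view(b * t, d, h, w)
    print(tokens.shape)               # torch.Size([6, 4, 32, 32])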

forward(x, output_shape=None)[source]

CoDANO’s forward pass.

  • If self.permutation_eq == True, computes the permutation-equivariant forward pass, where the mixer FNO block is applied to each token separately, making the final result equivariant to any permutation of tokens.

  • If self.permutation_eq == False, the mixer is applied to the whole function together, and tokens are treated as channels within the same function.

Parameters:
x : torch.Tensor

Input tensor with shape (b, t * d, h, w, …), where b is the batch size, t is the number of tokens, and d is the token codimension.
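
A small sketch of the permutation-equivariance property described above for permutation_eq=True. It assumes the output keeps one channel per token and that equivariance holds up to numerical tolerance; treat it as an illustrative check, not a guarantee:

    import torch
    from neuralop.layers.coda_layer import CODALayer

    torch.manual_seed(0)
    layer = CODALayer(n_modes=[8, 8]).eval()   # permutation_eq=True by default

    b, t, h, w = 1, 3, 32, 32                  # token_codimension=1, so channels == tokens
    x = torch.randn(b, t, h, w)
    perm = torch.tensor([2, 0, 1])             # an arbitrary reordering of the tokens

    with torch.no_grad():
        y = layer(x)
        y_perm = layer(x[:, perm])

    # If the layer is equivariant to token permutations, permuting the input tokens
    # should permute the output tokens in the same way.
    print(torch.allclose(y[:, perm], y_perm, atol=1e-4))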