neuralop.layers.coda_layer.CODALayer

class neuralop.layers.coda_layer.CODALayer(n_modes, n_heads=1, token_codimension=1, head_codimension=None, codimension_size=None, per_channel_attention=True, permutation_eq=True, norm='instance_norm', temperature=1.0, nonlinear_attention=False, scale=None, resolution_scaling_factor=None, incremental_n_modes=None, non_linearity=<built-in function gelu>, use_channel_mlp=True, channel_mlp_expansion=1.0, fno_skip='linear', channel_mlp_skip='linear', preactivation=False, separable=False, factorization='tucker', rank=1.0, joint_factorization=False, conv_module=<class 'neuralop.layers.spectral_convolution.SpectralConv'>, fixed_rank_modes=False, implementation='factorized', decomposition_kwargs=None)[source]

Co-domain Attention Blocks (CODALayer)

It implements the transformer architecture in the operator learning framework, as described in [1].

Parameters:
n_modes : list

Number of modes for each dimension, used in the K, Q, V operators.

n_heads : int, optional

Number of heads for the attention mechanism, by default 1

token_codimension : int, optional

Co-dimension of each variable, i.e. number of output channels associated with each variable, by default 1

head_codimension : int, optional

Co-dimension of each output token for each head, by default None

codimension_size : int, optional

Size of the codimension for the whole function. Only used when permutation_eq = False, by default None

per_channel_attention : bool, optional

Whether to use per-channel attention. If True, token_codimension is overwritten to 1, by default True

permutation_eq : bool, optional

Whether to use a permutation-equivariant mixer layer after the attention mechanism, by default True

norm : literal {‘instance_norm’} or None, optional

Normalization module to be used. Options: “instance_norm”, None. If ‘instance_norm’, instance normalization is applied to the token outputs of the attention module, by default “instance_norm”

temperature : float, optional

Temperature parameter for the attention mechanism, by default 1.0

nonlinear_attention : bool, optional

Whether to use a non-linear activation in the K, Q, V operators, by default False

scale : int, optional

Scale for downsampling the Q and K functions before computing the attention matrix. A higher scale downsamples more, by default None

resolution_scaling_factor : float, optional

Scaling factor for the output resolution, by default None
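
A minimal usage sketch under the defaults above (per-channel attention, instance norm). The concrete shapes and the assumption that the output keeps one channel per token are illustrative, not guaranteed by this page:

    import torch
    from neuralop.layers.coda_layer import CODALayer

    # 2D problem with 3 physical variables (tokens); with per_channel_attention=True
    # (the default) each token has codimension 1, so channels == tokens.
    layer = CODALayer(n_modes=[16, 16], n_heads=2)

    b, t, d = 4, 3, 1                 # batch size, number of tokens, token codimension
    h, w = 64, 64                     # spatial resolution
    x = torch.randn(b, t * d, h, w)   # forward expects shape (b, t*d, h, w, ...)

    out = layer(x)
    print(out.shape)                  # assumed: (4, 3, 64, 64) when no resolution scaling is used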

Methods

compute_attention(tokens, batch_size)

Compute the key-query-value variant of the attention matrix for input token functions.

forward(x[, output_shape])

CoDANO's forward pass.

Other Parameters:
incremental_n_modes : list, optional

Incremental number of modes for each dimension (for incremental training), by default None

use_channel_mlp : bool, optional

Whether to use MLP layers to parameterize skip connections, by default True

channel_mlp_expansion : float, optional

Expansion parameter for self.channel_mlp, by default 1.0

non_linearity : callable, optional

Non-linearity function to be used. Options: F.gelu, F.relu, F.leaky_relu, F.silu, F.tanh, by default F.gelu

preactivation : bool, optional

Whether to use preactivation, by default False

fno_skip : str, optional

Type of skip connection to be used. Options: “linear”, “soft-gating”, “identity”, by default ‘linear’

channel_mlp_skip : str, optional

Module to use for ChannelMLP skip connections. Options: “linear”, “soft-gating”, “identity”, by default ‘linear’

separable : bool, optional

Whether to use separable convolutions, by default False

factorization : str, optional

Type of factorization to be used. Options: “tucker”, “cp”, “tt”, None, by default ‘tucker’

rank : float, optional

Rank of the factorization, by default 1.0

conv_module : callable, optional

Spectral convolution module to be used, by default SpectralConv

joint_factorization : bool, optional

Whether to factorize all SpectralConv weights as one tensor, by default False
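
For the FNO-block options above, a hedged constructor sketch with arbitrary example values (not recommended settings):

    import torch.nn.functional as F
    from neuralop.layers.coda_layer import CODALayer

    # CP factorization at half rank, soft-gating skip connections, SiLU activation.
    layer = CODALayer(
        n_modes=[16, 16],
        n_heads=2,
        factorization="cp",
        rank=0.5,
        fno_skip="soft-gating",
        channel_mlp_skip="soft-gating",
        non_linearity=F.silu,
    )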

References

compute_attention(tokens, batch_size)[source]

Compute the key-query-value variant of the attention matrix for input token functions.

Parameters:
tokens : torch.Tensor

Input tokens with shape (b * t, d, h, w, …), where: b is the batch size, t is the number of tokens, d is the token codimension, and h, w, … are the domain dimensions. Assumes input tokens have been normalized.

batch_size : int

The size of the batch.
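
A shape-only sketch in plain torch (not the layer's internal code) of how the layer input of shape (b, t * d, h, w, …) maps onto the per-token layout (b * t, d, h, w, …) described above, assuming a token-major channel ordering:

    import torch

    b, t, d, h, w = 2, 3, 4, 32, 32
    x = torch.randn(b, t * d, h, w)   # layer input: t tokens of codimension d,
                                      # stacked along the channel dimension

    # Each token function becomes its own leading-dimension entry of codimension d.
    tokens = x.view(b * t, d, h, w)
    print(tokens.shape)               # torch.Size([6, 4, 32, 32])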

forward(x, output_shape=None)[source]

CoDANO’s forward pass.

  • If self.permutation_eq == True, computes the permutation-equivariant forward pass, where the mixer FNO block is applied to each token separately, making the final result equivariant to any permutation of tokens.

  • If self.permutation_eq == False, the mixer is applied to the whole function together, and tokens are treated as channels within the same function.

Parameters:
x : torch.Tensor

Input tensor with shape (b, t * d, h, w, …), where b is the batch size, t is the number of tokens, and d is the token codimension.
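
A small sketch of the permutation-equivariance property described above for permutation_eq=True. It assumes the output keeps one channel per token and that equivariance holds up to numerical tolerance; treat it as an illustrative check, not a guarantee:

    import torch
    from neuralop.layers.coda_layer import CODALayer

    torch.manual_seed(0)
    layer = CODALayer(n_modes=[8, 8]).eval()   # permutation_eq=True by default

    b, t, h, w = 1, 3, 32, 32                  # token_codimension=1, so channels == tokens
    x = torch.randn(b, t, h, w)
    perm = torch.tensor([2, 0, 1])             # an arbitrary reordering of the tokens

    with torch.no_grad():
        y = layer(x)
        y_perm = layer(x[:, perm])

    # If the layer is equivariant to token permutations, permuting the input tokens
    # should permute the output tokens in the same way.
    print(torch.allclose(y[:, perm], y_perm, atol=1e-4))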