Layers#

Neural network layers (Linear, Conv2D, etc.)

Submodule Overview#

nabla.nn.layers.linear_forward(x, weight, bias=None)[source]#

Forward pass through a linear layer.

Computes: output = x @ weight + bias

Parameters:
  • x (Array) – Input tensor of shape (batch_size, in_features)

  • weight (Array) – Weight tensor of shape (in_features, out_features)

  • bias (Array | None) – Optional bias tensor of shape (1, out_features) or (out_features,)

Returns:

Output tensor of shape (batch_size, out_features)

Return type:

Array
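A minimal usage sketch (not from the source docs). Only `linear_forward` is documented above; `nb.randn` and `nb.zeros` are assumed Array constructors used purely for illustration and may differ in your nabla version:

```python
import nabla as nb                 # assumed top-level alias
from nabla.nn import layers        # documented module path

# Assumed constructors: substitute whatever Array factories your install provides.
x = nb.randn((4, 3))               # (batch_size=4, in_features=3)
weight = nb.randn((3, 8))          # (in_features=3, out_features=8)
bias = nb.zeros((8,))              # broadcast across the batch dimension

out = layers.linear_forward(x, weight, bias)
print(out.shape)                   # expected: (4, 8)
```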

nabla.nn.layers.mlp_forward(x, params)[source]#

MLP forward pass through all layers.

This is the original MLP forward function from mlp_train_jit.py; it applies a ReLU activation after every layer except the last.

Parameters:
  • x (Array) – Input tensor of shape (batch_size, input_dim)

  • params (list[Array]) – List of parameters [W1, b1, W2, b2, …, Wn, bn]

Returns:

Output tensor of shape (batch_size, output_dim)

Return type:

Array
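A usage sketch showing the expected flat parameter ordering [W1, b1, W2, b2, …]. As above, `nb.randn` and `nb.zeros` are assumed constructors, not part of the documented API:

```python
import nabla as nb                 # assumed top-level alias
from nabla.nn import layers        # documented module path

# Two-layer MLP: 3 -> 16 -> 2. Parameters must be ordered [W1, b1, W2, b2].
params = [
    nb.randn((3, 16)), nb.zeros((1, 16)),   # hidden layer (ReLU applied)
    nb.randn((16, 2)), nb.zeros((1, 2)),    # output layer (no activation)
]
x = nb.randn((8, 3))                        # (batch_size=8, input_dim=3)

logits = layers.mlp_forward(x, params)
print(logits.shape)                         # expected: (8, 2)
```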

nabla.nn.layers.mlp_forward_with_activations(x, params, activation='relu', final_activation=None)[source]#

MLP forward pass with configurable activations.

Parameters:
  • x (Array) – Input tensor of shape (batch_size, input_dim)

  • params (list[Array]) – List of parameters [W1, b1, W2, b2, …, Wn, bn]

  • activation (str) – Activation function for hidden layers (“relu”, “tanh”, “sigmoid”)

  • final_activation (str | None) – Optional activation for final layer

Returns:

Output tensor of shape (batch_size, output_dim)

Return type:

Array
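A sketch of the configurable-activation variant, e.g. a binary-classification head with tanh hidden layers and a sigmoid output. Constructors are assumed as in the previous examples:

```python
import nabla as nb                 # assumed top-level alias
from nabla.nn import layers        # documented module path

params = [
    nb.randn((3, 16)), nb.zeros((1, 16)),
    nb.randn((16, 1)), nb.zeros((1, 1)),
]
x = nb.randn((8, 3))

# Hidden layers use tanh; the final layer is squashed with sigmoid.
probs = layers.mlp_forward_with_activations(
    x, params, activation="tanh", final_activation="sigmoid"
)
print(probs.shape)                 # expected: (8, 1)
```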

nabla.nn.layers.relu(x)[source]#

Rectified Linear Unit activation function.

Parameters:

x (Array) – Input array

Returns:

Array with ReLU applied element-wise

Return type:

Array

nabla.nn.layers.leaky_relu(x, negative_slope=0.01)[source]#

Leaky ReLU activation function.

Parameters:
  • x (Array) – Input array

  • negative_slope (float) – Slope for negative values

Returns:

Array with Leaky ReLU applied element-wise

Return type:

Array
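A small sketch comparing relu and leaky_relu above. `nb.array` is an assumed constructor, and the printed values in the comments are the mathematically expected outputs:

```python
import nabla as nb                 # assumed top-level alias
from nabla.nn import layers        # documented module path

x = nb.array([-2.0, -0.5, 0.0, 1.5])            # assumed constructor

print(layers.relu(x))                            # expected: [0.0, 0.0, 0.0, 1.5]
print(layers.leaky_relu(x, negative_slope=0.1))  # expected: [-0.2, -0.05, 0.0, 1.5]
```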

nabla.nn.layers.sigmoid(x)[source]#

Sigmoid activation function.

Parameters:

x (Array) – Input array

Returns:

Array with sigmoid applied element-wise

Return type:

Array

nabla.nn.layers.tanh(x)[source]#

Hyperbolic tangent activation function.

Parameters:

x (Array) – Input array

Returns:

Array with tanh applied element-wise

Return type:

Array
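A quick numeric check relating sigmoid and tanh via the identity tanh(x) = 2·sigmoid(2x) − 1. This assumes Array supports scalar arithmetic operators in addition to the assumed `nb.array` constructor:

```python
import nabla as nb                 # assumed top-level alias
from nabla.nn import layers        # documented module path

x = nb.array([-1.0, 0.0, 2.0])     # assumed constructor

# Mathematically tanh(x) = 2 * sigmoid(2x) - 1, so these should agree
# up to floating-point error.
a = layers.tanh(x)
b = 2.0 * layers.sigmoid(2.0 * x) - 1.0
print(a, b)
```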

nabla.nn.layers.gelu(x)[source]#

Gaussian Error Linear Unit activation function.

GELU(x) = x * Φ(x), where Φ(x) is the CDF of the standard normal distribution. Approximation: GELU(x) ≈ 0.5 * x * (1 + tanh(√(2/π) * (x + 0.044715 * x^3)))

Parameters:

x (Array) – Input array

Returns:

Array with GELU applied element-wise

Return type:

Array
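A sketch that re-derives the tanh approximation from the docstring using the documented `tanh`; if `gelu` uses the same approximation, the two results should match closely. Scalar arithmetic on Array and the `nb.array` constructor are assumptions:

```python
import math
import nabla as nb                 # assumed top-level alias
from nabla.nn import layers        # documented module path

x = nb.array([-1.0, 0.0, 1.0, 3.0])  # assumed constructor

# Rebuild the docstring's approximation: 0.5*x*(1 + tanh(sqrt(2/pi)*(x + 0.044715*x^3)))
c = math.sqrt(2.0 / math.pi)
approx = 0.5 * x * (1.0 + layers.tanh(c * (x + 0.044715 * x * x * x)))

print(layers.gelu(x), approx)
```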

nabla.nn.layers.swish(x, beta=1.0)[source]#

Swish (SiLU) activation function.

Swish(x) = x * sigmoid(β * x). When β = 1, this is SiLU (the Sigmoid Linear Unit).

Parameters:
  • x (Array) – Input array

  • beta (float) – Scaling factor for sigmoid

Returns:

Array with Swish applied element-wise

Return type:

Array

nabla.nn.layers.silu(x)[source]#

Sigmoid Linear Unit (SiLU) activation function.

SiLU(x) = x * sigmoid(x) = Swish(x, β=1)

Parameters:

x (Array) – Input array

Returns:

Array with SiLU applied element-wise

Return type:

Array
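A sketch illustrating the relationship between swish and silu stated above (`nb.array` is an assumed constructor):

```python
import nabla as nb                 # assumed top-level alias
from nabla.nn import layers        # documented module path

x = nb.array([-2.0, 0.0, 2.0])     # assumed constructor

# By definition, silu(x) should coincide with swish(x, beta=1.0).
print(layers.silu(x))
print(layers.swish(x, beta=1.0))

# Larger beta makes swish behave more like a hard ReLU.
print(layers.swish(x, beta=5.0))
```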

nabla.nn.layers.softmax(x, axis=-1)[source]#

Softmax activation function.

Parameters:
  • x (Array) – Input array

  • axis (int) – Axis along which to compute softmax

Returns:

Array with softmax applied along specified axis

Return type:

Array

nabla.nn.layers.log_softmax(x, axis=-1)[source]#

Log-softmax activation function.

Parameters:
  • x (Array) – Input array

  • axis (int) – Axis along which to compute log-softmax

Returns:

Array with log-softmax applied along specified axis

Return type:

Array
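A sketch covering both softmax and log_softmax. The `nb.randn` constructor and the `nb.sum` reduction used in the check are assumptions, not part of the documented API:

```python
import nabla as nb                 # assumed top-level alias
from nabla.nn import layers        # documented module path

logits = nb.randn((2, 5))          # assumed constructor

p = layers.softmax(logits, axis=-1)
lp = layers.log_softmax(logits, axis=-1)

# Each row of p should sum to 1; lp equals log(p) but is typically computed
# in a more numerically stable way for large-magnitude logits.
print(nb.sum(p, axis=-1))          # assumed reduction; expected ~[1.0, 1.0]
print(lp.shape)                    # expected: (2, 5)
```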

nabla.nn.layers.get_activation(name)[source]#

Get activation function by name.

Parameters:

name (str) – Name of the activation function

Returns:

Activation function corresponding to the given name

Raises:

ValueError – If no activation function with the given name is found
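A sketch of looking up an activation by name and handling the documented ValueError (`nb.array` is an assumed constructor):

```python
import nabla as nb                 # assumed top-level alias
from nabla.nn import layers        # documented module path

x = nb.array([-1.0, 0.0, 1.0])     # assumed constructor

act = layers.get_activation("gelu")  # look up by name
print(act(x))

try:
    layers.get_activation("not-an-activation")
except ValueError as err:
    print("unknown activation:", err)
```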