Layers
Neural network layers (Linear, Conv2D, etc.)
Submodule Overview
-
nabla.nn.layers.linear_forward(x, weight, bias=None)[source]
Forward pass through a linear layer.
Computes: output = x @ weight + bias
- Parameters:
x (Array) – Input tensor of shape (batch_size, in_features)
weight (Array) – Weight tensor of shape (in_features, out_features)
bias (Array | None) – Optional bias tensor of shape (1, out_features) or (out_features,)
- Returns:
Output tensor of shape (batch_size, out_features)
- Return type:
Array
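Example (a minimal sketch; the array constructors nb.randn and nb.zeros are assumptions about Nabla's creation API and may need adapting to your version):
    import nabla as nb
    from nabla.nn.layers import linear_forward

    # Constructors below (nb.randn, nb.zeros) are assumed, not documented on this page.
    x = nb.randn((32, 64))        # input, shape (batch_size=32, in_features=64)
    weight = nb.randn((64, 128))  # weights, shape (in_features=64, out_features=128)
    bias = nb.zeros((128,))       # optional bias, shape (out_features,)

    y = linear_forward(x, weight, bias)  # x @ weight + bias -> shape (32, 128)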
-
nabla.nn.layers.mlp_forward(x, params)[source]
MLP forward pass through all layers.
This is the original MLP forward function from mlp_train_jit.py.
Applies a ReLU activation after every layer except the last (output) layer.
- Parameters:
x (Array) – Input tensor of shape (batch_size, input_dim)
params (list[Array]) – List of parameters [W1, b1, W2, b2, …, Wn, bn]
- Returns:
Output tensor of shape (batch_size, output_dim)
- Return type:
Array
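Example showing the expected parameter layout (a sketch; nb.randn and nb.zeros are assumed constructors):
    import nabla as nb
    from nabla.nn.layers import mlp_forward

    # Two-layer MLP, 8 -> 16 -> 4. Constructors are assumed, not part of this page.
    params = [
        nb.randn((8, 16)), nb.zeros((16,)),  # W1, b1
        nb.randn((16, 4)), nb.zeros((4,)),   # W2, b2
    ]
    x = nb.randn((5, 8))          # (batch_size=5, input_dim=8)
    out = mlp_forward(x, params)  # ReLU after the first layer, linear output; shape (5, 4)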
-
nabla.nn.layers.mlp_forward_with_activations(x, params, activation='relu', final_activation=None)[source]
MLP forward pass with configurable activations.
- Parameters:
x (Array) – Input tensor of shape (batch_size, input_dim)
params (list[Array]) – List of parameters [W1, b1, W2, b2, …, Wn, bn]
activation (str) – Activation function for hidden layers (“relu”, “tanh”, “sigmoid”)
final_activation (str | None) – Optional activation for final layer
- Returns:
Output tensor of shape (batch_size, output_dim)
- Return type:
Array
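Example with configurable activations (same parameter layout as mlp_forward; constructors again assumed):
    import nabla as nb
    from nabla.nn.layers import mlp_forward_with_activations

    params = [
        nb.randn((8, 16)), nb.zeros((16,)),  # W1, b1 (constructors assumed)
        nb.randn((16, 4)), nb.zeros((4,)),   # W2, b2
    ]
    x = nb.randn((5, 8))
    # tanh on the hidden layer, sigmoid on the output layer
    out = mlp_forward_with_activations(x, params, activation="tanh", final_activation="sigmoid")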
-
nabla.nn.layers.relu(x)[source]
Rectified Linear Unit activation function.
- Parameters:
x (Array) – Input array
- Returns:
Array with ReLU applied element-wise
- Return type:
Array
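Example (nb.array is assumed as the constructor; expected values shown in the comment):
    import nabla as nb
    from nabla.nn.layers import relu

    x = nb.array([-2.0, -0.5, 0.0, 1.5])  # constructor assumed
    relu(x)  # max(0, x) element-wise -> [0.0, 0.0, 0.0, 1.5]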
-
nabla.nn.layers.leaky_relu(x, negative_slope=0.01)[source]
Leaky ReLU activation function.
LeakyReLU(x) = x if x > 0, else negative_slope * x
- Parameters:
x (Array) – Input array
negative_slope (float) – Slope applied to negative inputs (default: 0.01)
- Returns:
Array with Leaky ReLU applied element-wise
- Return type:
Array
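Example showing the effect of negative_slope (nb.array assumed):
    import nabla as nb
    from nabla.nn.layers import leaky_relu

    x = nb.array([-2.0, 3.0])          # constructor assumed
    leaky_relu(x)                      # default slope 0.01 -> [-0.02, 3.0]
    leaky_relu(x, negative_slope=0.2)  # -> [-0.4, 3.0]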
-
nabla.nn.layers.sigmoid(x)[source]
Sigmoid activation function.
- Parameters:
x (Array) – Input array
- Returns:
Array with sigmoid applied element-wise
- Return type:
Array
-
nabla.nn.layers.tanh(x)[source]
Hyperbolic tangent activation function.
- Parameters:
x (Array) – Input array
- Returns:
Array with tanh applied element-wise
- Return type:
Array
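Both sigmoid and tanh squash their inputs element-wise; for example (nb.array assumed, values approximate):
    import nabla as nb
    from nabla.nn.layers import sigmoid, tanh

    x = nb.array([-2.0, 0.0, 2.0])  # constructor assumed
    sigmoid(x)  # maps to (0, 1)  -> ≈ [0.119, 0.5, 0.881]
    tanh(x)     # maps to (-1, 1) -> ≈ [-0.964, 0.0, 0.964]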
-
nabla.nn.layers.gelu(x)[source]
Gaussian Error Linear Unit activation function.
GELU(x) = x * Φ(x), where Φ(x) is the CDF of the standard normal distribution.
Approximation: GELU(x) ≈ 0.5 * x * (1 + tanh(√(2/π) * (x + 0.044715 * x^3)))
- Parameters:
x (Array) – Input array
- Returns:
Array with GELU applied element-wise
- Return type:
Array
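Example (values approximate; nb.array assumed):
    import nabla as nb
    from nabla.nn.layers import gelu

    x = nb.array([-1.0, 0.0, 1.0])  # constructor assumed
    gelu(x)  # ≈ [-0.159, 0.0, 0.841]: small negatives are damped rather than clipped to zero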
-
nabla.nn.layers.swish(x, beta=1.0)[source]
Swish (SiLU) activation function.
Swish(x) = x * sigmoid(β * x)
When β = 1, this is SiLU (Sigmoid Linear Unit).
- Parameters:
x (Array) – Input array
beta (float) – Scaling factor β applied inside the sigmoid (default: 1.0)
- Returns:
Array with Swish applied element-wise
- Return type:
Array
-
nabla.nn.layers.silu(x)[source]
Sigmoid Linear Unit (SiLU) activation function.
SiLU(x) = x * sigmoid(x) = Swish(x, β=1)
- Parameters:
x (Array) – Input array
- Returns:
Array with SiLU applied element-wise
- Return type:
Array
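Example illustrating the relationship between swish and silu (nb.array assumed):
    import nabla as nb
    from nabla.nn.layers import silu, swish

    x = nb.array([-1.0, 0.0, 2.0])  # constructor assumed
    silu(x)             # x * sigmoid(x)
    swish(x)            # beta defaults to 1.0, so this matches silu(x)
    swish(x, beta=2.0)  # sharper gate: x * sigmoid(2 * x)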
-
nabla.nn.layers.softmax(x, axis=-1)[source]
Softmax activation function.
- Parameters:
x (Array) – Input array
axis (int) – Axis along which to apply softmax (default: -1)
- Returns:
Array with softmax applied along specified axis
- Return type:
Array
-
nabla.nn.layers.log_softmax(x, axis=-1)[source]
Log-softmax activation function.
- Parameters:
x (Array) – Input array
axis (int) – Axis along which to apply log-softmax (default: -1)
- Returns:
Array with log-softmax applied along specified axis
- Return type:
Array
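Example comparing the two (values approximate; nb.array assumed):
    import nabla as nb
    from nabla.nn.layers import softmax, log_softmax

    logits = nb.array([[2.0, 1.0, 0.1]])  # constructor assumed; shape (1, 3)
    probs = softmax(logits, axis=-1)      # each row sums to 1 -> ≈ [[0.659, 0.242, 0.099]]
    logp = log_softmax(logits, axis=-1)   # generally preferred over log(softmax(x)) for numerical stability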
-
nabla.nn.layers.get_activation(name)[source]
Get an activation function by name.
- Parameters:
name (str) – Name of the activation function
- Returns:
Activation function
- Raises:
ValueError – If the requested activation function is not found
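Example ("relu" is one of the names listed for mlp_forward_with_activations above; nb.array assumed):
    import nabla as nb
    from nabla.nn.layers import get_activation

    act = get_activation("relu")    # look up the activation by name
    y = act(nb.array([-1.0, 2.0]))  # constructor assumed -> [0.0, 2.0]
    # An unknown name raises ValueError:
    # get_activation("not_an_activation")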