Init
Parameter initialization strategies
Submodule Overview
nabla.nn.init.he_normal(shape, seed=None)[source]
He normal initialization for ReLU networks.
Draws from a normal distribution with std = sqrt(2 / fan_in), the scaling
derived by He et al. to preserve activation variance under ReLU.
- Parameters:
shape (tuple[int, ...]) – Shape of the parameter tensor
seed (int | None) – Random seed for reproducibility
- Returns:
Initialized parameter array
- Return type:
Array
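A minimal usage sketch; the shape and seed are illustrative, and it is assumed here that the implementation takes fan_in from the first dimension of shape:

```python
from nabla.nn.init import he_normal

# Weights for a 784 -> 256 ReLU layer.
# With fan_in = 784 the expected std is sqrt(2/784) ≈ 0.0505.
W = he_normal((784, 256), seed=0)
```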
nabla.nn.init.he_uniform(shape, seed=None)[source]
He uniform initialization for ReLU networks.
Draws from a uniform distribution on [-bound, bound] with
bound = sqrt(6 / fan_in), the He scaling suited to ReLU activations.
- Parameters:
shape (tuple[int, ...]) – Shape of the parameter tensor
seed (int | None) – Random seed for reproducibility
- Returns:
Initialized parameter array
- Return type:
Array
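The uniform variant is called the same way; under the same fan_in assumption as above, every sample falls inside the bound:

```python
from nabla.nn.init import he_uniform

# For fan_in = 784 the bound is b = sqrt(6/784) ≈ 0.0875,
# so every entry of W lies in [-0.0875, 0.0875].
W = he_uniform((784, 256), seed=0)
```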
nabla.nn.init.xavier_normal(shape, seed=None)[source]
Xavier/Glorot normal initialization.
Draws from a normal distribution with std = sqrt(2 / (fan_in + fan_out)),
the Glorot scaling suited to sigmoid and tanh activations.
- Parameters:
shape (tuple[int, ...]) – Shape of the parameter tensor
seed (int | None) – Random seed for reproducibility
- Returns:
Initialized parameter array
- Return type:
Array
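A sketch for a tanh layer; the shape is illustrative, and note that both dimensions enter the scale:

```python
from nabla.nn.init import xavier_normal

# A 256 -> 128 tanh layer: std = sqrt(2/(256 + 128)) ≈ 0.0722.
W = xavier_normal((256, 128), seed=0)
```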
nabla.nn.init.xavier_uniform(shape, seed=None)[source]
Xavier/Glorot uniform initialization.
Draws from a uniform distribution on [-bound, bound] with
bound = sqrt(6 / (fan_in + fan_out)), the Glorot scaling suited to
sigmoid and tanh activations.
- Parameters:
shape (tuple[int, ...]) – Shape of the parameter tensor
seed (int | None) – Random seed for reproducibility
- Returns:
Initialized parameter array
- Return type:
Array
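The uniform counterpart for the same illustrative layer:

```python
from nabla.nn.init import xavier_uniform

# Same 256 -> 128 layer, uniform on [-b, b] with
# b = sqrt(6/(256 + 128)) = 0.125 exactly.
W = xavier_uniform((256, 128), seed=0)
```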
nabla.nn.init.lecun_normal(shape, seed=None)[source]
LeCun normal initialization.
Draws from a normal distribution with std = sqrt(1 / fan_in), the LeCun
scaling that self-normalizing SELU networks rely on.
- Parameters:
shape (tuple[int, ...]) – Shape of the parameter tensor
seed (int | None) – Random seed for reproducibility
- Returns:
Initialized parameter array
- Return type:
Array
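A sketch for a SELU layer, again assuming fan_in comes from the first dimension:

```python
from nabla.nn.init import lecun_normal

# A 128 -> 128 SELU layer: std = sqrt(1/128) ≈ 0.0884.
W = lecun_normal((128, 128), seed=0)
```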
nabla.nn.init.initialize_mlp_params(layers, seed=42)[source]
Initialize MLP parameters with a specialized strategy for complex functions.
This is the original initialization strategy from mlp_train_jit.py, tuned
for learning high-frequency functions.
- Parameters:
layers (list[int]) – List of layer sizes [input, hidden1, hidden2, …, output]
seed (int) – Random seed for reproducibility
- Returns:
List of parameter arrays [W1, b1, W2, b2, …]
- Return type:
list[Array]
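A sketch of building the full parameter list for a small regression MLP; the layer sizes here are illustrative:

```python
from nabla.nn.init import initialize_mlp_params

# A 1 -> 64 -> 64 -> 1 MLP. The returned list alternates
# weights and biases: [W1, b1, W2, b2, W3, b3].
params = initialize_mlp_params([1, 64, 64, 1], seed=42)
```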