Init#

Parameter initialization strategies

Submodule Overview#

nabla.nn.init.he_normal(shape, seed=None)[source]#

He normal initialization for ReLU networks.

Uses a normal distribution with std = sqrt(2/fan_in), which is optimal for ReLU activations.

Parameters:
  • shape (tuple[int, ...]) – Shape of the parameter tensor

  • seed (int | None) – Random seed for reproducibility

Returns:

Initialized parameter array

Return type:

Array
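
Example (a minimal sketch; the import path and the assumption that fan_in is the first dimension of shape are inferred, not confirmed by this page):

>>> from nabla.nn.init import he_normal
>>> # Weight matrix for a ReLU layer with 784 inputs and 256 outputs;
>>> # entries are drawn from N(0, 2/784) if fan_in is the first shape dimension.
>>> W = he_normal((784, 256), seed=0)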

nabla.nn.init.he_uniform(shape, seed=None)[source]#

He uniform initialization for ReLU networks.

Uses a uniform distribution with bound = sqrt(6/fan_in), which is optimal for ReLU activations.

Parameters:
  • shape (tuple[int, ...]) – Shape of the parameter tensor

  • seed (int | None) – Random seed for reproducibility

Returns:

Initialized parameter array

Return type:

Array
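
Example (same assumptions as above regarding the import path and fan_in):

>>> from nabla.nn.init import he_uniform
>>> # For a 512-input ReLU layer, values are sampled uniformly from
>>> # [-sqrt(6/512), sqrt(6/512)] ≈ [-0.108, 0.108].
>>> W = he_uniform((512, 128), seed=1)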

nabla.nn.init.xavier_normal(shape, seed=None)[source]#

Xavier/Glorot normal initialization.

Uses a normal distribution with std = sqrt(2/(fan_in + fan_out)), which is optimal for sigmoid/tanh activations.

Parameters:
  • shape (tuple[int, ...]) – Shape of the parameter tensor

  • seed (int | None) – Random seed for reproducibility

Returns:

Initialized parameter array

Return type:

Array
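
Example (a sketch; fan_in and fan_out are assumed to be the two axes of shape):

>>> from nabla.nn.init import xavier_normal
>>> # Square tanh layer: std = sqrt(2 / (256 + 256)) = 0.0625.
>>> W = xavier_normal((256, 256), seed=2)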

nabla.nn.init.xavier_uniform(shape, seed=None)[source]#

Xavier/Glorot uniform initialization.

Uses a uniform distribution with bound = sqrt(6/(fan_in + fan_out)), which is optimal for sigmoid/tanh activations.

Parameters:
  • shape (tuple[int, ...]) – Shape of the parameter tensor

  • seed (int | None) – Random seed for reproducibility

Returns:

Initialized parameter array

Return type:

Array
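
Example (same assumptions as above):

>>> from nabla.nn.init import xavier_uniform
>>> # 128 -> 64 sigmoid layer: bound = sqrt(6 / (128 + 64)) ≈ 0.177.
>>> W = xavier_uniform((128, 64), seed=3)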

nabla.nn.init.lecun_normal(shape, seed=None)[source]#

LeCun normal initialization.

Uses a normal distribution with std = sqrt(1/fan_in), which is optimal for SELU activations.

Parameters:
  • shape (tuple[int, ...]) – Shape of the parameter tensor

  • seed (int | None) – Random seed for reproducibility

Returns:

Initialized parameter array

Return type:

Array
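
Example (a sketch; fan_in is assumed to be the first dimension of shape):

>>> from nabla.nn.init import lecun_normal
>>> # SELU layer with 100 inputs: std = sqrt(1/100) = 0.1, which keeps
>>> # activations near unit variance for self-normalizing networks.
>>> W = lecun_normal((100, 50), seed=4)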

nabla.nn.init.initialize_mlp_params(layers, seed=42)[source]#

Initialize MLP parameters with a specialized strategy for complex functions.

This is the original initialization strategy from mlp_train_jit.py, optimized for learning high-frequency functions.

Parameters:
  • layers (list[int]) – List of layer sizes [input, hidden1, hidden2, …, output]

  • seed (int) – Random seed for reproducibility

Returns:

List of parameter arrays [W1, b1, W2, b2, …]

Return type:

list[Array]
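
Example (a sketch; the layer-size list and the [W1, b1, W2, b2, …] ordering follow the documentation above, everything else is assumed):

>>> from nabla.nn.init import initialize_mlp_params
>>> # MLP with 1 input, two hidden layers of 64 units, and 1 output.
>>> params = initialize_mlp_params([1, 64, 64, 1], seed=42)
>>> # The flat list alternates weights and biases for the three layers:
>>> # [W1, b1, W2, b2, W3, b3].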