LoRA#
init_lora_adapter#
def init_lora_adapter(weight: 'Tensor', rank: 'int', init_std: 'float' = 0.01, dtype: 'DType | None' = None) -> 'dict[str, Tensor]':
Initialise LoRA adapter matrices A and B for a 2D weight.
Following Hu et al. (2021), A is initialised with Gaussian noise and
B is zero-initialised, so the adapter contributes a zero update at the start of training.
Parameters
- `weight` – The frozen 2D weight tensor to adapt. Shape `(in, out)`.
- `rank` – Intrinsic rank of the low-rank decomposition. Must be > 0.
- `init_std` – Standard deviation for initialising `A`. Default: `0.01`.
- `dtype` – Optional dtype override. Defaults to `weight`'s dtype.
Returns
Dict `{'A': Tensor(in, rank), 'B': Tensor(rank, out)}`.
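A minimal NumPy sketch of this initialisation scheme, for illustration only: the real function returns the library's `Tensor` type, and the helper name `init_lora_adapter_np` is hypothetical.

```python
import numpy as np

def init_lora_adapter_np(weight: np.ndarray, rank: int, init_std: float = 0.01) -> dict:
    """Gaussian A, zero B, so A @ B == 0 and the adapter starts as a no-op."""
    in_dim, out_dim = weight.shape                   # weight has shape (in, out)
    rng = np.random.default_rng()
    A = rng.normal(0.0, init_std, size=(in_dim, rank)).astype(weight.dtype)
    B = np.zeros((rank, out_dim), dtype=weight.dtype)
    return {"A": A, "B": B}

W = np.zeros((768, 3072), dtype=np.float32)
adapter = init_lora_adapter_np(W, rank=8)
assert not np.any(adapter["A"] @ adapter["B"])       # delta is exactly zero at step 0
```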
lora_delta#
def lora_delta(adapter: 'dict[str, Tensor]', alpha: 'float' = 1.0) -> 'Tensor':
Compute the scaled LoRA weight update: (alpha / rank) * A @ B.
Parameters
- `adapter` – Dict with keys `'A'` of shape `(in, rank)` and `'B'` of shape `(rank, out)`.
- `alpha` – Scaling factor. Default: `1.0`.
Returns
Delta tensor of shape (in, out).
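For reference, an equivalent NumPy computation of the formula above; the rank is read off `A`'s second axis, and the helper name `lora_delta_np` is illustrative.

```python
import numpy as np

def lora_delta_np(adapter: dict, alpha: float = 1.0) -> np.ndarray:
    A, B = adapter["A"], adapter["B"]    # (in, rank), (rank, out)
    rank = A.shape[1]
    return (alpha / rank) * (A @ B)      # delta of shape (in, out)
```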
lora_linear#
def lora_linear(x: 'Tensor', frozen_weight: 'Tensor', adapter: 'dict[str, Tensor]', alpha: 'float' = 1.0) -> 'Tensor':
Linear projection combining the frozen weight path and the LoRA adapter path: x @ W + (alpha / rank) * x @ A @ B.
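A NumPy sketch of this forward pass. The adapter path can be evaluated as `(x @ A) @ B`, which avoids materialising the full `(in, out)` delta; whether the library factors it this way internally is an assumption here, and the helper name is illustrative.

```python
import numpy as np

def lora_linear_np(x, frozen_weight, adapter, alpha: float = 1.0):
    A, B = adapter["A"], adapter["B"]
    scale = alpha / A.shape[1]                        # alpha / rank
    return x @ frozen_weight + scale * ((x @ A) @ B)  # frozen path + adapter path
```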
merge_lora_weight#
def merge_lora_weight(frozen_weight: 'Tensor', adapter: 'dict[str, Tensor]', alpha: 'float' = 1.0) -> 'Tensor':
Merge the LoRA adapter into the frozen weight: W_merged = W + delta.
Parameters
- `frozen_weight` – Original frozen weight tensor.
- `adapter` – LoRA adapter dict (see `init_lora_adapter`).
- `alpha` – Scaling factor for the adapter. Default: `1.0`.
Returns
Merged weight tensor with the same shape as frozen_weight.
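Merging is an elementwise add of the scaled delta, so inference can then run through a single matmul with no adapter overhead. A self-contained NumPy sketch (helper name hypothetical):

```python
import numpy as np

def merge_lora_weight_np(frozen_weight, adapter, alpha: float = 1.0):
    A, B = adapter["A"], adapter["B"]
    return frozen_weight + (alpha / A.shape[1]) * (A @ B)   # W + (alpha/rank) * A @ B
```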
unmerge_lora_weight#
def unmerge_lora_weight(merged_weight: 'Tensor', adapter: 'dict[str, Tensor]', alpha: 'float' = 1.0) -> 'Tensor':
Recover the original frozen weight by subtracting the LoRA delta.
Parameters
- `merged_weight` – Previously merged weight tensor.
- `adapter` – LoRA adapter dict used during merging.
- `alpha` – Scaling factor used during merging. Default: `1.0`.
Returns
Recovered frozen weight tensor.
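Unmerging subtracts the same delta, so (up to floating-point error) merging followed by unmerging recovers the original weight. A small NumPy round-trip check, with all names illustrative:

```python
import numpy as np

def unmerge_lora_weight_np(merged_weight, adapter, alpha: float = 1.0):
    A, B = adapter["A"], adapter["B"]
    return merged_weight - (alpha / A.shape[1]) * (A @ B)

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 32)).astype(np.float32)
adapter = {"A": rng.normal(0.0, 0.01, size=(16, 4)).astype(np.float32),
           "B": rng.normal(size=(4, 32)).astype(np.float32)}
merged = W + (1.0 / 4) * (adapter["A"] @ adapter["B"])      # merge with alpha = 1.0
assert np.allclose(unmerge_lora_weight_np(merged, adapter), W, atol=1e-6)
```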
tree_lora_delta#
def tree_lora_delta(adapters: 'Any', alpha: 'float' = 1.0, *, is_leaf: 'Any' = None) -> 'Any':
Map a pytree of LoRA adapter dicts to their low-rank deltas.
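A plain-Python sketch of the semantics: recursively walk nested dicts, lists, and tuples, replacing every adapter dict (the leaves) with its scaled delta. The real function presumably delegates to the library's pytree utilities and honours `is_leaf`; this sketch hard-codes a simple leaf test and is illustrative only.

```python
import numpy as np

def _is_adapter(node) -> bool:
    # Treat any dict with exactly the keys 'A' and 'B' as an adapter leaf.
    return isinstance(node, dict) and set(node) == {"A", "B"}

def tree_lora_delta_np(adapters, alpha: float = 1.0):
    if _is_adapter(adapters):
        A, B = adapters["A"], adapters["B"]
        return (alpha / A.shape[1]) * (A @ B)        # (alpha / rank) * A @ B
    if isinstance(adapters, dict):
        return {k: tree_lora_delta_np(v, alpha) for k, v in adapters.items()}
    if isinstance(adapters, (list, tuple)):
        return type(adapters)(tree_lora_delta_np(v, alpha) for v in adapters)
    return adapters                                  # non-adapter leaves pass through
```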