Tri Dao
|
bcd918f275
[LayerNorm] Add option to write result to out and residual_out
|
4 months ago |
Tri Dao
|
bd82d6c6eb
Revert "[LayerNorm] Don't store x + residual if we don't need gradients"
|
4 months ago |
Tri Dao
|
800401847e
[LayerNorm] Don't store x + residual if we don't need gradients
|
4 months ago |
Tri Dao
|
36587c01cb
[LayerNorm] Update layer_norm_linear
|
9 months ago |
Tri Dao
|
bdcae547c7
[LayerNorm] Don't exit early in the backward pass (fix #781)
|
10 months ago |
Tri Dao
|
c9861a032d
[LayerNorm] Initialize mean and rstd tensor using x.device
|
11 months ago |
Tri Dao
|
f5b308e258
[LayerNorm] Rename layernorm.py -> layer_norm.py
|
11 months ago |