Tri Dao
|
bcd918f275
[LayerNorm] Add option to write result to out and residual_out
|
4 months ago |
Tri Dao
|
bd82d6c6eb
Revert "[LayerNorm] Don't store x + residual if we don't need gradients"
|
4 months ago |
Tri Dao
|
800401847e
[LayerNorm] Don't store x + residual if we don't need gradients
|
4 months ago |
Tri Dao
|
36587c01cb
[LayerNorm] Update layer_norm_linear
|
9 months ago |
Tri Dao
|
bdcae547c7
[LayerNorm] Don't exit early in the backward pass (fix #781)
|
10 months ago |
Tri Dao
|
c9861a032d
[LayerNorm] Initialize mean and rstd tensor using x.device
|
11 months ago |
Tri Dao
|
f5b308e258
[LayerNorm] Rename layernorm.py -> layer_norm.py
|
11 months ago |