Tri Dao
|
71befc19e1
[Loss] Use flash_attn.losses.cross_entropy.CrossEntropyLoss
|
1 year ago |
Tri Dao
|
dff68c2b22
Add smoothing for CrossEntropyParallel, rename to CrossEntropyLoss
|
2 years ago |
Tri Dao
|
0bf5e50038
Release training code
|
2 years ago |