Tri Dao
|
71befc19e1
[Loss] Use flash_attn.losses.cross_entropy.CrossEntropyLoss
|
1 year ago |
Tri Dao
|
dff68c2b22
Add smoothing for CrossEntropyParallel, rename to CrossEntropyLoss
|
2 years ago |
Tri Dao
|
0bf5e50038
Release training code
|
2 years ago |