Commit History

Author SHA1 Message Date
  Tri Dao abbc131173 [LayerNorm] Switch from CUDA to Triton implementation 11 months ago
  Tri Dao 393882bc08 [LayerNorm] Implement LN with parallel residual, support dim 8k 1 year ago
  Tri Dao 6738d9477d [LayerNorm] Implement RMS Norm 1 year ago
  Tri Dao 8c6609ae1a [LayerNorm] Support all dimensions up to 6k (if divisible by 8) 2 years ago
  Tri Dao 0bf5e50038 Release training code 2 years ago
  Tri Dao 43ab0b5205 Mention that some CUDA extensions have only been tested on A100s 2 years ago
  Tri Dao 2e33fc8e36 Add GPT and ViT models 2 years ago
  Tri Dao fa6d1ce44f Add fused_dense and dropout_add_layernorm CUDA extensions 2 years ago