Commit History

Author SHA1 Message Date
  Tri Dao f1a73d0740 Run isort and black on python files 1 year ago
  Tri Dao d2f4324f4c [LayerNorm] Make sure memory addresses are aligned to 16 bytes 1 year ago
  Tri Dao 96d10f6545 Implement LLaMa 1 year ago
  Tri Dao 393882bc08 [LayerNorm] Implement LN with parallel residual, support dim 8k 1 year ago
  Tri Dao eb33e587e9 [LayerNorm] Rename x1 -> residual 1 year ago
  Tri Dao 6738d9477d [LayerNorm] Implement RMS Norm 1 year ago
  Tri Dao 5fb6df0e04 Implement BERT 2 years ago
  Tri Dao 5db330519a [LayerNorm] Support taking subset of input or subset of output 2 years ago
  Tri Dao ae137ed17a [LayerNorm] Fuse LayerScale 2 years ago
  Tri Dao 8c6609ae1a [LayerNorm] Support all dimensions up to 6k (if divisible by 8) 2 years ago
  Tri Dao fa6d1ce44f Add fused_dense and dropout_add_layernorm CUDA extensions 2 years ago