Commit history

Author SHA1 Message Date
  Tri Dao f1a73d0740 Run isort and black on python files 1 year ago
  Tri Dao 5d079fdd7a [Triton] Fix benchmark_causal, mention Triton version 1 year ago
  Tri Dao 6b5f271c6d [Triton] Avoid einops repeat by using Tensor.expand 2 years ago
  Tri Dao b8ccd20098 [Triton] Fix variable name from qkv to kv (h/t FrankZijlstra) 2 years ago
  Tri Dao 908a5b2244 Set num_warps=4 for headdim=64 in Triton fw (h/t Michael Benesty) 2 years ago
  Tri Dao 7479757191 Fix pipelining bug in Triton bwd with bias_type=matrix 2 years ago
  Tri Dao 557781933d Parallelize CUDA bwd along seqlen_k instead of seqlen_q 2 years ago
  Tri Dao 62025e1aff Fix more race condition in Triton bwd when there's bias 2 years ago
  Tri Dao ff78ea4123 Fix race condition in Triton bwd when there's bias 2 years ago
  Tri Dao 86862cfd7b Implement attention bias for Triton version 2 years ago
  Tri Dao 470010f59b Fix race condition for Triton bwd for headdim 48 and 96 2 years ago
  Tri Dao aacc10fbab Fix race condition in Triton bwd for non-po2 headdims 2 years ago
  Tri Dao 1fb12afdfb Avoid memcpy in the Triton bwd 2 years ago
  Tri Dao 731f154de3 Fix race conditions in the Triton bwd for headdim=64 2 years ago
  Tri Dao 9b0bc97872 Fix race condition in Triton fwd 2 years ago
  Tri Dao 215930bce3 Fix EVEN_M & EVEN_HEADDIM for headdim=40 in Triton bwd 2 years ago
  Tri Dao 4f81aff46e Add debug_barrier for all headdims in Triton bwd 2 years ago
  Tri Dao bedcbd6a71 Disable some autotune configs that give wrong results in Triton bwd 2 years ago
  Tri Dao e78d509c64 [WIP] Support all head dimensions up to 128 in the Triton bwd 2 years ago
  Tri Dao 008951f1d9 Support all head dimensions up to 128 in the Triton fwd 2 years ago
  Tri Dao b910bf14c1 Support arbitrary seqlens (both q & k) in Triton bwd 2 years ago
  Tri Dao dc55469355 Support arbitrary seqlen_k in Triton bwd 2 years ago
  Tri Dao d11341fd1a Fix Triton fwd to support seqlen not multiples of 128 2 years ago
  Tri Dao b0c0db81f6 Implement FlashAttention in Triton 2 years ago