Commit History

Author SHA1 Message Date
  Tri Dao 4ead9bd7cc [FA3] Varlen forward 4 months ago
  Tri Dao 74b0761ff7 [FA3] BF16 forward 5 months ago
  Tri Dao 898dd4bbf2 Pass seqused_k to _flash_attn_varlen_forward 5 months ago
  Tri Dao 7ef24848cf Add FA3 image 5 months ago
  Tri Dao 7f67966cc7 FA3 initial code release 5 months ago
  Tri Dao b4a9dd6c9c Temporarily switch to cutlass fork for more shapes 5 months ago
  Tri Dao 7551202cb2 Bump to v2.6.1 5 months ago
  Tri Dao 844912dca0 [CI] Switch from CUDA 12.2 to 12.3 5 months ago
  Tri Dao 40e534a7f6 Implement cache_leftpad 5 months ago
  Tri Dao 116b05f9b0 [CI] Compile with pytorch 2.4.0.dev20240514 5 months ago
  Tri Dao da11d1b853 Bump v2.6.0 5 months ago
  Tri Dao d0787acc16 Relax dropout_fraction test 5 months ago
  Tri Dao dca6d89da4 Don't support softcap and dropout at the same time 5 months ago
  Tri Dao 81e01efd4b More typo fixes 5 months ago
  Tri Dao 72e27c6320 Fix typo with softcapping 5 months ago
  Tri Dao 3d41db3e2c Only test backward if there's no softcapping 5 months ago
  Tri Dao 908511b2b6 Split into more .cu files to speed up compilation 5 months ago
  Tri Dao 1d536d7de5 Minor cleanup of softcapping 5 months ago
  Tri Dao beb2bf2a32 Drop support for pytorch 1.12, 1.13, and python 3.7 5 months ago
  Phil Wang f4628b43ec missing commas and backwards return arguments (#1032) 5 months ago
  Nicolas Patry 8f873cc6ac Implement softcapping. (#1025) 5 months ago
  Jianwei Dong 4e8d60069f Add the return_softmax_lse parameter to the flash_attn_with_kvcache function to allow returning the logsumexp of the attention scores. (#989) 5 months ago
  muoshuosha 6df7e0a02e Fix the varlen deterministic test (#1023) 5 months ago
  66RING 9486635c92 Fix typos of comments about shape. (#837) 5 months ago
  JDKWangGuan 0d810cfb73 Fix KeyError handling for non-existing key in state_dict.pop() (#898) 5 months ago
  cao lei 6a2a16e994 fix typo (#974) 5 months ago
  Nicolas Patry 5bf201966a Fixing argument checking when using `seqlenq_ngroups_swapped`. (#976) 5 months ago
  Liang ab59ec3590 remove swizzle part of `sV.data()` to get a completely non-swizzle `sVtNoSwizzle` (#984) 5 months ago
  Grigory Sizov f816dee63c Support unpadded LSE layout (#970) 5 months ago
  Tri Dao 320fb59487 Update citation 6 months ago