Commit History

Author SHA1 Message Date
  Tri Dao 39afd52bd2 Actually fix window_size for bwd pass 3 days ago
  Tri Dao a44cd67d3f Move testing util functions to a separate file 3 days ago
  Tri Dao 82abd8daca Don't disable window size when is_causal==true for bwd pass 3 days ago
  Tri Dao a609d82315 Change extension name to flash_attn_3_cuda 3 days ago
  Tri Dao 5acb532214 Switch to cutlass v3.6.0, fix perf regression for hdim 128 causal 3 days ago
  Tri Dao 0890032358 Implement backward pass for Sm80 4 days ago
  Tri Dao a53f7380b6 Don't disable window_size if is_causal=true 4 days ago
  Tri Dao f907a13187 Tune tile sizes for fwd varlen on Sm80 and Sm86 4 days ago
  Tri Dao 76f14c61c9 Tune fwd tile sizes for Sm86 and Sm89 4 days ago
  Tri Dao c4c624f868 Rename bwd epilogue file 1 week ago
  Tri Dao 51484a7b56 Make backward epilogue work for Sm80 1 week ago
  Tri Dao 14894c5717 Make BwdPostprocessKernel work with Sm80 1 week ago
  Tri Dao 659a631f4c Rename bwd classes to include Sm90 suffix 1 week ago
  Tri Dao 1fba7b499f Merge mha_fwd, mha_varlen_fwd, mha_fwd_kvcache C++ interface 1 week ago
  Tri Dao a901c7eeda Make Sm80 forward pass work with persistent scheduler 1 week ago
  Tri Dao 65a0f59ef5 Change CP_ASYNC_CACHEGLOBAL to CP_ASYNC_CACHEGLOBAL_ZFILL for compat 1 week ago
  Tri Dao b16d814c62 Revert to before Cutlass 3.6.0 update to investigate perf issue 1 week ago
  Tri Dao 2ba29df99e Fix hanging when using AppendKV with persistent scheduler 1 week ago
  Tri Dao 8ec230f833 Fix to compile with Cutlass 3.6.0 1 week ago
  Tri Dao 64e6e0a09d Switch to Cutlass 3.6.0 official release 2 weeks ago
  Tri Dao c93451d5f8 Fix causal using n_block_min instead of n_block_min_causal_local_mas 2 weeks ago
  Tri Dao 6863fde13f Fix bug in paged KV overshooting kBlockN in smem 2 weeks ago
  Tri Dao 5171269dab Implement forward pass for Sm80 2 weeks ago
  Tri Dao da264e5742 Change file names and class names to include sm90 suffix 2 weeks ago
  Tri Dao 111ee9d478 Add back gemm_sm80 to utils, make copy work with has_with_bool 2 weeks ago
  Tri Dao 5f25b9781f Make epilogue_fwd work for Ampere 2 weeks ago
  Tri Dao 69bd392159 Merge bwd and bwd_varlen in the C++ API 2 weeks ago
  Tri Dao c3cdc0fd88 Add sm_margin as an option for overlapping with communication 2 weeks ago
  Tri Dao 3f85126149 Use persistent scheduler when paged_kv 2 weeks ago
  Tri Dao 147ac33a2e Tune num_splits for local, don't split when num_n_blocks is small 2 weeks ago