Commit History

Author SHA1 Message Date
  Jay Shah 9b6cba16c1 remove some debug code 2 months ago
  Jay Shah dec7dee1b1 fix integer sign compare warning 2 months ago
  Jay Shah 50cb90aea6 comment out unimplemented kwargs from flash_attn_with_kvcache 2 months ago
  Jay Shah b3d60fa3a5 prune more dead code 2 months ago
  Jay Shah 8efb953eeb remove commented out code 2 months ago
  Jay Shah a7cce59d25 adjust tolerances in test script for kv cache 2 months ago
  Jay Shah c06cc0ba9f change cu_seqlens_k to seqused_k for kv cache api 2 months ago
  Jay Shah 7c1473e0e5 remove Is_batch_dynamic from seqlen traits and handle fp8 perf regression using smem boolean 2 months ago
  Jay Shah 1ecf821207 remove constexpr checks for actual seqlen in mainloop 2 months ago
  Jay Shah 8374e1fa78 remove test code 2 months ago
  Jay Shah 35f3542442 refactor names 2 months ago
  Jay Shah b0f067efdc revert epi change for fp8 due to measured perf regression 2 months ago
  Jay Shah eb9c0ee22a add rmem -> gmem for fp8 2 months ago
  Jay Shah 551b91f4c9 uniform notation 2 months ago
  Jay Shah 7169b23399 unify rmem -> gmem methods 2 months ago
  Jay Shah ab5d336e61 better writeout logic with vectorization 2 months ago
  Jay Shah d437d3dd5c remove smem usage for when rmem -> gmem epilogue is used 2 months ago
  Ganesh Bikshandi e49cb5f77c passes except for hdim=256. 2 months ago
  Ganesh Bikshandi dc2c952f37 compiles and builes. Not validates. 2 months ago
  Ganesh Bikshandi a075e769fb handle gqa_parallel with rmem-to-gmem. Not validating yet. 2 months ago
  Jay Shah 4a4dbd29c5 move IsRegToGmem 2 months ago
  Jay Shah 8f45a8cfa2 tests passing now for non-gqa impl 2 months ago
  Ganesh Bikshandi f0b49460ec changes to use tiledcopy (still not passing). 2 months ago
  Ganesh Bikshandi 8fbefa8ac4 adding rmem to gmem. (Not validating yet). 2 months ago
  Jay Shah 785d978165 fix bug with fp8 q layout 2 months ago
  Jay Shah aa0e699412 move descale tensor declarations outside of conditional 2 months ago
  Jay Shah fff4b5c09b add split kv benchmark script 2 months ago
  Jay Shah bc4b8722f6 add crude hdim 64 heuristic 2 months ago
  Jay Shah 930c8cad98 reorg mma code for less redundancy 2 months ago
  Jay Shah 03200a753f removed old gqa cu files and unified methods 2 months ago