Commit History

Author | SHA1 | Message | Date
Tri Dao | 5acb532214 | Switch to cutlass v3.6.0, fix perf regression for hdim 128 causal | 2 weeks ago
Tri Dao | 65a0f59ef5 | Change CP_ASYNC_CACHEGLOBAL to CP_ASYNC_CACHEGLOBAL_ZFILL for compat | 4 weeks ago
Tri Dao | 8ec230f833 | Fix to compile with Cutlass 3.6.0 | 4 weeks ago
Tri Dao | 6863fde13f | Fix bug in paged KV overshooting kBlockN in smem | 1 month ago
Tri Dao | 94657af3e8 | Add option for not doing intra-WG overlapping of gemm and softmax | 2 months ago
Tri Dao | fe412d6b36 | Redo rotary when contiguous | 2 months ago
Tri Dao | b2d3fe92ff | Move rotary to a separate file | 2 months ago
Tri Dao | 4d00645c76 | Implement appending new KV to KV cache | 2 months ago
Tri Dao | d00b88ee05 | Move PagedKV to a separate file | 2 months ago