Commit History

Author SHA1 Message Date
  Tri Dao 5acb532214 Switch to cutlass v3.6.0, fix perf regression for hdim 128 causal 2 weeks ago
  Tri Dao 65a0f59ef5 Change CP_ASYNC_CACHEGLOBAL to CP_ASYNC_CACHEGLOBAL_ZFILL for compat 4 weeks ago
  Tri Dao 8ec230f833 Fix to compile with Cutlass 3.6.0 4 weeks ago
  Tri Dao 6863fde13f Fix bug in paged KV overshooting kBlockN in smem 1 month ago
  Tri Dao 94657af3e8 Add option for not doing intra-WG overlapping of gemm and softmax 2 months ago
  Tri Dao fe412d6b36 Redo rotary when contiguous 2 months ago
  Tri Dao b2d3fe92ff Move rotary to a separate file 2 months ago
  Tri Dao 4d00645c76 Implement appending new KV to KV cache 2 months ago
  Tri Dao d00b88ee05 Move PagedKV to a separate file 2 months ago