Tri Dao | 6807b1ea37 | Longest-processing-time-first scheduler for causal | 23 hours ago
Tri Dao | 6293008748 | Add option for Mma0_is_RS and Mma1_is_RS in attn fwd | 1 week ago
Tri Dao | e8a1edbeb2 | Clean up some #include | 2 weeks ago
Tri Dao | df96486c31 | Decode: varlen, paged KV, leftpad | 1 month ago
Tri Dao | 6e8b25e426 | Refactor | 2 months ago
jayhshah | c92ca63268 | FA3 FP8 qkv descales + restore max offset for h128 causal + added sync for producer WG (#1173) | 3 months ago
Tri Dao | bafe253042 | [FA3] Bwd | 4 months ago
jayhshah | 5018ac6ac5 | Fp8 kernel with "in-kernel" transpose of V in producer (#1100) | 4 months ago
Tri Dao | 7f67966cc7 | FA3 initial code release | 5 months ago