Tri Dao
|
f907a13187
Tune tile sizes for fwd varlen on Sm80 and Sm86
|
4 weeks ago |
Tri Dao
|
76f14c61c9
Tune fwd tile sizes for Sm86 and Sm89
|
4 weeks ago |
Tri Dao
|
5171269dab
Implement forward pass for Sm80
|
1 month ago |
Tri Dao
|
3f85126149
Use persistent scheduler when paged_kv
|
1 month ago |
Tri Dao
|
3e5d77a102
Group instantiations for different hdims together
|
1 month ago |
Tri Dao
|
6807b1ea37
Longest-processing-time-first scheduler for causal
|
1 month ago |
Tri Dao
|
6293008748
Add option for Mma0_is_RS and Mma1_is_RS in attn fwd
|
1 month ago |
Tri Dao
|
2c996ca25f
Use SeqlenInfo for bwd and epilogue
|
1 month ago |
Tri Dao
|
9c954f7021
Use num_split_heuristics in fwd and fwd_varlen
|
2 months ago |
Tri Dao
|
f6e165becf
Change tile_size and local to avoid wgmma being serialized
|
2 months ago |
Tri Dao
|
94657af3e8
Add option for not doing intra-WG overlapping of gemm and softmax
|
2 months ago |
Tri Dao
|
fc2fd95a18
Renable FP8 kernels
|
2 months ago |
Tri Dao
|
586ba914bb
Move fwd tile size to a separate file
|
2 months ago |