Tri Dao
|
8c20cfef49
[Rotary] Support qkv block layout from GQA
|
3 months ago |
Ivan Komarov
|
f692b98d80
Fix spurious re-compilations of `rotary_kernel` (#911)
|
8 months ago |
Tri Dao
|
8a733cbd53
[Gen] Fix calling update_graph_cache in tests
|
1 year ago |
Tri Dao
|
9795159082
[Rotary] Set device before launching Triton kernel to avoid error
|
1 year ago |
Tri Dao
|
b28ec236df
[Rotary] Implement varlen rotary
|
1 year ago |
Tri Dao
|
861c82577d
[Rotary] Clean up rotary Triton implementation a bit
|
1 year ago |
Tri Dao
|
1c523c1ce1
[Rotary] Speed up rotary kernel when interleaved=True
|
1 year ago |
Tri Dao
|
942fcbf046
[Rotary] Implement rotary in Triton
|
1 year ago |