Ying Zhang | cdbbe844b1 | minor changes to unpad_input test util func | 3 months ago
Tri Dao | cc1690d9d6 | [Rotary] Add test for rotary when qkv are packed an there's GQA | 3 months ago
Ivan Komarov | f692b98d80 | Fix spurious re-compilations of `rotary_kernel` (#911) | 8 months ago
Tri Dao | b28ec236df | [Rotary] Implement varlen rotary | 1 year ago
Tri Dao | 1c523c1ce1 | [Rotary] Speed up rotary kernel when interleaved=True | 1 year ago
Tri Dao | 942fcbf046 | [Rotary] Implement rotary in Triton | 1 year ago
Tri Dao | 0e8c46ae08 | Run isort and black on test files | 1 year ago
Tri Dao | d4b320b31f | Add MLP, MHA, Block, Embedding modules | 2 years ago
Tri Dao | ca81f32e04 | Implement rotary embedding in CUDA | 2 years ago