Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Ying Zhang | cdbbe844b1 | minor changes to unpad_input test util func | 3 months ago |
| Tri Dao | cc1690d9d6 | [Rotary] Add test for rotary when qkv are packed an there's GQA | 3 months ago |
| Ivan Komarov | f692b98d80 | Fix spurious re-compilations of `rotary_kernel` (#911) | 8 months ago |
| Tri Dao | b28ec236df | [Rotary] Implement varlen rotary | 1 year ago |
| Tri Dao | 1c523c1ce1 | [Rotary] Speed up rotary kernel when interleaved=True | 1 year ago |
| Tri Dao | 942fcbf046 | [Rotary] Implement rotary in Triton | 1 year ago |
| Tri Dao | 0e8c46ae08 | Run isort and black on test files | 1 year ago |
| Tri Dao | d4b320b31f | Add MLP, MHA, Block, Embedding modules | 2 years ago |
| Tri Dao | ca81f32e04 | Implement rotary embedding in CUDA | 2 years ago |