david/flash-attention

mirror of https://github.com/Dao-AILab/flash-attention

Author	SHA1	Message	Date
Tri Dao	e45a46a5b7	[Rotary] Implement GPT-J style (interleaved) rotary	1 year ago
Tri Dao	1e712ea8b0	Implement TensorParallel for MHA	2 years ago
Tri Dao	ca81f32e04	Implement rotary embedding in CUDA	2 years ago