Latest commit: Tri Dao · 73df3be7d5 · Add test for BTLM init · 11 months ago
| File | Last commit | Message | Updated |
|------|-------------|---------|---------|
| test_baichuan.py | 3f7d5786ba | Pass alibi slopes to flash_attn_with_kvcache during generation | 11 months ago |
| test_bert.py | 07005806ff | Add BigCode converters (#532) | 1 year ago |
| test_bigcode.py | dfe29f5e2b | [Gen] Don't use ft_attention, use flash_attn_with_kvcache instead | 1 year ago |
| test_btlm.py | 73df3be7d5 | Add test for BTLM init | 11 months ago |
| test_falcon.py | dfe29f5e2b | [Gen] Don't use ft_attention, use flash_attn_with_kvcache instead | 1 year ago |
| test_gpt.py | e0fbaa7016 | [Gen] Simplify decode_speculative | 1 year ago |
| test_gpt_generation_parallel.py | 0705d2718d | [Llama] Fix some tests, add tests for Llama 2 and CodeLlama | 1 year ago |
| test_gpt_neox.py | d0032700d1 | Add tests for Pythia, GPT-JT, and RedPajama models | 1 year ago |
| test_gpt_parallel.py | 0e8c46ae08 | Run isort and black on test files | 1 year ago |
| test_gptj.py | dfe29f5e2b | [Gen] Don't use ft_attention, use flash_attn_with_kvcache instead | 1 year ago |
| test_llama.py | 0705d2718d | [Llama] Fix some tests, add tests for Llama 2 and CodeLlama | 1 year ago |
| test_opt.py | dfe29f5e2b | [Gen] Don't use ft_attention, use flash_attn_with_kvcache instead | 1 year ago |
| test_vit.py | 0e8c46ae08 | Run isort and black on test files | 1 year ago |
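
Several of the commit messages above switch the generation tests from ft_attention to `flash_attn_with_kvcache`, and one passes ALiBi slopes to it during generation. A minimal sketch of that call pattern, assuming a CUDA GPU with the flash-attn package installed; the shapes, sequence lengths, and slope values are illustrative, not taken from the tests themselves.

```python
# Sketch: one decoding step through flash_attn_with_kvcache with ALiBi slopes.
# Assumes a CUDA device and flash-attn installed; all sizes are illustrative.
import torch
from flash_attn import flash_attn_with_kvcache

batch, nheads, headdim = 2, 8, 64
max_seqlen, seqlen_new = 256, 1

# Preallocated KV cache; cache_seqlens tracks how much of each batch row is filled.
k_cache = torch.zeros(batch, max_seqlen, nheads, headdim, dtype=torch.float16, device="cuda")
v_cache = torch.zeros_like(k_cache)
cache_seqlens = torch.full((batch,), 16, dtype=torch.int32, device="cuda")

# One new token per sequence; its K/V get appended into the cache in place.
q = torch.randn(batch, seqlen_new, nheads, headdim, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# One ALiBi slope per head, in fp32; a geometric series (2^-1, ..., 2^-nheads)
# is used here purely as an example.
alibi_slopes = torch.tensor(
    [2.0 ** -(i + 1) for i in range(nheads)], dtype=torch.float32, device="cuda"
)

out = flash_attn_with_kvcache(
    q, k_cache, v_cache, k=k, v=v,
    cache_seqlens=cache_seqlens,
    causal=True,
    alibi_slopes=alibi_slopes,
)
print(out.shape)  # (batch, seqlen_new, nheads, headdim)
```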