Latest commit: `ccbb14f38e` — Implement rotary embedding in flash_attn_with_kvcache (Tri Dao, 1 year ago)
| Name | Commit | Commit message | Date |
|------|--------|----------------|------|
| layers | a86442f0f3 | [Gen] Use flash_attn_with_kvcache in generation | 1 year ago |
| losses | 5400fdc4ac | [CE] Implement CrossEntropyLoss in Triton | 1 year ago |
| models | d0032700d1 | Add tests for Pythia, GPT-JT, and RedPajama models | 1 year ago |
| modules | 8a733cbd53 | [Gen] Fix calling update_graph_cache in tests | 1 year ago |
| ops | 5400fdc4ac | [CE] Implement CrossEntropyLoss in Triton | 1 year ago |
| utils | a86442f0f3 | [Gen] Use flash_attn_with_kvcache in generation | 1 year ago |
| __init__.py | 08c295c043 | Bump to v2.2.2 | 1 year ago |
| bert_padding.py | 8f6f48d8a8 | add unpad_input_for_concatenated_sequences (#499) | 1 year ago |
| flash_attn_interface.py | ccbb14f38e | Implement rotary embedding in flash_attn_with_kvcache | 1 year ago |
| flash_attn_triton.py | f1a73d0740 | Run isort and black on python files | 1 year ago |
| flash_attn_triton_og.py | f1a73d0740 | Run isort and black on python files | 1 year ago |
| flash_blocksparse_attention.py | f1a73d0740 | Run isort and black on python files | 1 year ago |
| flash_blocksparse_attn_interface.py | f1a73d0740 | Run isort and black on python files | 1 year ago |
| fused_softmax.py | f1a73d0740 | Run isort and black on python files | 1 year ago |
| pyproject.toml | 73bd3f3bbb | Move pyproject.toml to flash-attn and tests dir to avoid PEP 517 | 1 year ago |
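The listing reflects the layout of the `flash_attn` Python package: the public attention kernels are surfaced through `flash_attn_interface.py` (and re-exported from the package root), variable-length padding helpers live in `bert_padding.py`, and the Triton-based cross-entropy loss sits under `losses/` and `ops/`. As a minimal sketch of how the two commits referenced above are typically exercised — assuming a CUDA GPU, fp16 tensors, and the flash-attn 2.x entry points `flash_attn_func` and `flash_attn_with_kvcache` — usage looks roughly like this (shapes and sizes below are illustrative, not from the source):

```python
# Minimal sketch, assuming flash-attn 2.x installed and a CUDA device available.
import torch
from flash_attn import flash_attn_func, flash_attn_with_kvcache

batch, seqlen, nheads, headdim = 2, 1024, 16, 64
q = torch.randn(batch, seqlen, nheads, headdim, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# Full-sequence causal attention (training / prefill).
out = flash_attn_func(q, k, v, causal=True)

# Incremental decoding: pre-allocate a KV cache, prefill it with the prompt,
# then attend a single new query token while appending its k/v to the cache.
max_seqlen = 4096
k_cache = torch.zeros(batch, max_seqlen, nheads, headdim, dtype=torch.float16, device="cuda")
v_cache = torch.zeros_like(k_cache)
k_cache[:, :seqlen] = k
v_cache[:, :seqlen] = v
cache_seqlens = torch.full((batch,), seqlen, dtype=torch.int32, device="cuda")

q_new = torch.randn(batch, 1, nheads, headdim, dtype=torch.float16, device="cuda")
k_new = torch.randn_like(q_new)
v_new = torch.randn_like(q_new)
out_step = flash_attn_with_kvcache(
    q_new, k_cache, v_cache, k=k_new, v=v_new,
    cache_seqlens=cache_seqlens, causal=True,
)
```

The padding utilities in `bert_padding.py` (e.g. `unpad_input` / `pad_input`) are what the variable-length (`*_varlen_*`) kernels pair with when batches contain sequences of different lengths; their exact return signatures vary across versions, so check the installed release before relying on them.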