| Name | Commit | Last commit message | Last updated |
|------|--------|---------------------|--------------|
| `layers` | a86442f0f3 | [Gen] Use flash_attn_with_kvcache in generation | 1 year ago |
| `losses` | 5400fdc4ac | [CE] Implement CrossEntropyLoss in Triton | 1 year ago |
| `models` | d0032700d1 | Add tests for Pythia, GPT-JT, and RedPajama models | 1 year ago |
| `modules` | 8a733cbd53 | [Gen] Fix calling update_graph_cache in tests | 1 year ago |
| `ops` | 5400fdc4ac | [CE] Implement CrossEntropyLoss in Triton | 1 year ago |
| `utils` | a86442f0f3 | [Gen] Use flash_attn_with_kvcache in generation | 1 year ago |
| `__init__.py` | 08c295c043 | Bump to v2.2.2 | 1 year ago |
| `bert_padding.py` | 8f6f48d8a8 | add unpad_input_for_concatenated_sequences (#499) | 1 year ago |
| `flash_attn_interface.py` | ccbb14f38e | Implement rotary embedding in flash_attn_with_kvcache | 1 year ago |
| `flash_attn_triton.py` | f1a73d0740 | Run isort and black on python files | 1 year ago |
| `flash_attn_triton_og.py` | f1a73d0740 | Run isort and black on python files | 1 year ago |
| `flash_blocksparse_attention.py` | f1a73d0740 | Run isort and black on python files | 1 year ago |
| `flash_blocksparse_attn_interface.py` | f1a73d0740 | Run isort and black on python files | 1 year ago |
| `fused_softmax.py` | f1a73d0740 | Run isort and black on python files | 1 year ago |
| `pyproject.toml` | 73bd3f3bbb | Move pyproject.toml to flash-attn and tests dir to avoid PEP 517 | 1 year ago |
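
Several of the commits above wire `flash_attn_with_kvcache` (defined in `flash_attn_interface.py`) into generation. Below is a minimal sketch of one incremental decoding step with a preallocated KV cache; the tensor shapes and default dtypes are assumptions based on typical flash-attn v2.2.x usage, not taken from this listing.

```python
# Sketch of single-token decoding with flash_attn_with_kvcache.
# Assumes flash-attn v2.2.x with CUDA and an fp16-capable GPU.
import torch
from flash_attn.flash_attn_interface import flash_attn_with_kvcache

batch, nheads, headdim = 2, 16, 64
max_seqlen = 1024

# Preallocated KV cache; the kernel writes new keys/values into it in place.
k_cache = torch.zeros(batch, max_seqlen, nheads, headdim,
                      device="cuda", dtype=torch.float16)
v_cache = torch.zeros_like(k_cache)

# Number of tokens already cached per sequence (here: 10 for each).
cache_seqlens = torch.full((batch,), 10, dtype=torch.int32, device="cuda")

# Query/key/value for one decoding step (seqlen_q = 1).
q = torch.randn(batch, 1, nheads, headdim, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# k/v are appended to the cache at position cache_seqlens, then attention
# runs over the cached prefix plus the new token.
out = flash_attn_with_kvcache(q, k_cache, v_cache, k=k, v=v,
                              cache_seqlens=cache_seqlens, causal=True)
print(out.shape)  # (batch, 1, nheads, headdim)
```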
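The `losses` and `ops` directories carry the Triton cross-entropy commit. A hedged usage sketch follows; the module path `flash_attn.losses.cross_entropy` and the assumption that the class is a drop-in replacement for `torch.nn.CrossEntropyLoss` are inferred from the directory layout above, not confirmed by this listing.

```python
# Sketch of the Triton cross-entropy loss; module path is an assumption.
import torch
from flash_attn.losses.cross_entropy import CrossEntropyLoss

loss_fn = CrossEntropyLoss()  # assumed drop-in for torch.nn.CrossEntropyLoss

vocab_size = 32000
logits = torch.randn(8, vocab_size, device="cuda",
                     dtype=torch.float16, requires_grad=True)
labels = torch.randint(0, vocab_size, (8,), device="cuda")

loss = loss_fn(logits, labels)  # scalar, mean-reduced by default
loss.backward()
```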