david/flash-attention: flash-attention from https://github.com/Dao-AILab/flash-attention @ changes_for_fp8

Mirror of https://github.com/Dao-AILab/flash-attention

Markus Krimmel 6bbc532388 fix: cast the alibi slopes to torch.float32 (#846)		9 months ago
__init__.py	ece539abd6 Add __init__.py files to subdirectories for installation	2 years ago
block.py	abbc131173 [LayerNorm] Switch from CUDA to Triton implementation	1 year ago
embedding.py	f1a73d0740 Run isort and black on python files	1 year ago
mha.py	6bbc532388 fix: cast the alibi slopes to torch.float32 (#846)	9 months ago
mlp.py	c3b2196652 Add Alibi to MHA, test with Baichuan-13B	1 year ago