david/flash-attention: flash-attention from https://github.com/Dao-AILab/flash-attention @ 26a6e0a0485f46716fc6aae6937919e03558a04d

SueJane 3f1b4d38e7 Fix: check the type of max_seqlen_k instead of checking max_seqlen twice (#1127)		4 months ago
..
__init__.py	ece539abd6 Add __init__.py files to subdirectories for installation	2 years ago
block.py	abbc131173 [LayerNorm] Switch from CUDA to Triton implementation	11 months ago
embedding.py	f1a73d0740 Run isort and black on python files	1 year ago
mha.py	3f1b4d38e7 Fix: check the type of max_seqlen_k instead of checking max_seqlen twice (#1127)	4 months ago
mlp.py	c3b2196652 Add Alibi to MHA, test with Baichuan-13B	1 year ago