david/flash-attention

mirror of https://github.com/Dao-AILab/flash-attention

Author	SHA1 Message	Date
Tri Dao	f1a73d0740 Run isort and black on python files	1 year ago
Xuechen Li	bb4cded17b support when num_heads is not divisible by world_size; resolves #459 (#461)	1 year ago
Tri Dao	93383bd55b [TP] Implement TensorParallel without sequence parallel	1 year ago
Tri Dao	c6ecd40a59 Tweak CrossEntropyLoss to take process_group in init	2 years ago
Tri Dao	b4018a5028 Implement Tensor Parallel for GPT model	2 years ago
Tri Dao	226a1b721d Implement TensorParallel for FusedDense and FusedDenseGeluDense	2 years ago