Tri Dao 65f723bb9a Split bwd into more .cu files to speed up compilation 4 months ago
composable_kernel @ 8182976c37 d8f104e97a Support AMD ROCm on FlashAttention 2 (#1010) 4 months ago
cutlass @ 756c351b49 74b0761ff7 [FA3] BF16 forward 5 months ago
flash_attn 65f723bb9a Split bwd into more .cu files to speed up compilation 4 months ago
flash_attn_ck d8f104e97a Support AMD ROCm on FlashAttention 2 (#1010) 4 months ago
ft_attention 50896ec574 Make nvcc threads configurable via environment variable (#885) 9 months ago
fused_dense_lib 50896ec574 Make nvcc threads configurable via environment variable (#885) 9 months ago
fused_softmax 50896ec574 Make nvcc threads configurable via environment variable (#885) 9 months ago
layer_norm 50896ec574 Make nvcc threads configurable via environment variable (#885) 9 months ago
rotary 50896ec574 Make nvcc threads configurable via environment variable (#885) 9 months ago
xentropy 50896ec574 Make nvcc threads configurable via environment variable (#885) 9 months ago