Tri Dao d5893f3c74 Merge branch 'main' into changes_for_fp8 4 months ago
__init__.py 7f67966cc7 FA3 initial code release 5 months ago
benchmark_attn.py dfe1a59e4b Add var-seq-len to FA3 fp16 / bf16 fwd (#1072) 4 months ago
benchmark_flash_attention.py cdc966e81a adding files for fp8 changes. 4 months ago
benchmark_flash_attention_fp8.py df66e974bc fixed odd-seq-len-k. 4 months ago
block_info.h 7f67966cc7 FA3 initial code release 5 months ago
epilogue_fwd_sm90_tma.hpp d5893f3c74 Merge branch 'main' into changes_for_fp8 4 months ago
flash.h dfe1a59e4b Add var-seq-len to FA3 fp16 / bf16 fwd (#1072) 4 months ago
flash_api.cpp d5893f3c74 Merge branch 'main' into changes_for_fp8 4 months ago
flash_attn_interface.py d5893f3c74 Merge branch 'main' into changes_for_fp8 4 months ago
flash_bwd_hdim128_fp16_sm90.cu 7f67966cc7 FA3 initial code release 5 months ago
flash_bwd_hdim256_fp16_sm90.cu 7f67966cc7 FA3 initial code release 5 months ago
flash_bwd_hdim64_fp16_sm90.cu 7f67966cc7 FA3 initial code release 5 months ago
flash_bwd_kernel.h 7f67966cc7 FA3 initial code release 5 months ago
flash_bwd_launch_template.h cb516f855b Remove torchlib dependency from cpp files (#1083) 4 months ago
flash_bwd_preprocess_kernel.h 7f67966cc7 FA3 initial code release 5 months ago
flash_fwd_hdim128_bf16_sm90.cu 74b0761ff7 [FA3] BF16 forward 5 months ago
flash_fwd_hdim128_fp16_sm90.cu 7f67966cc7 FA3 initial code release 5 months ago
flash_fwd_hdim128_fp8_sm90.cu cdc966e81a adding files for fp8 changes. 4 months ago
flash_fwd_hdim256_bf16_sm90.cu 74b0761ff7 [FA3] BF16 forward 5 months ago
flash_fwd_hdim256_fp16_sm90.cu 7f67966cc7 FA3 initial code release 5 months ago
flash_fwd_hdim256_fp8_sm90.cu cdc966e81a adding files for fp8 changes. 4 months ago
flash_fwd_hdim64_bf16_sm90.cu 74b0761ff7 [FA3] BF16 forward 5 months ago
flash_fwd_hdim64_fp16_sm90.cu 7f67966cc7 FA3 initial code release 5 months ago
flash_fwd_hdim64_fp8_sm90.cu cdc966e81a adding files for fp8 changes. 4 months ago
flash_fwd_kernel.h dfe1a59e4b Add var-seq-len to FA3 fp16 / bf16 fwd (#1072) 4 months ago
flash_fwd_launch_template.h d5893f3c74 Merge branch 'main' into changes_for_fp8 4 months ago
kernel_traits.h fe4c5b59df undid clang formatting. 4 months ago
mainloop_fwd_sm90_tma_gmma_ws.hpp d5893f3c74 Merge branch 'main' into changes_for_fp8 4 months ago
named_barrier.hpp 74b0761ff7 [FA3] BF16 forward 5 months ago
seq_len.h dfe1a59e4b Add var-seq-len to FA3 fp16 / bf16 fwd (#1072) 4 months ago
setup.py d5893f3c74 Merge branch 'main' into changes_for_fp8 4 months ago
softmax.h 7f67966cc7 FA3 initial code release 5 months ago
static_switch.h dfe1a59e4b Add var-seq-len to FA3 fp16 / bf16 fwd (#1072) 4 months ago
test_flash_attn.py d5893f3c74 Merge branch 'main' into changes_for_fp8 4 months ago
tile_scheduler.hpp 74b0761ff7 [FA3] BF16 forward 5 months ago
utils.h d5893f3c74 Merge branch 'main' into changes_for_fp8 4 months ago