jayhshah 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
__init__.py 7f67966cc7 FA3 initial code release 8 months ago
benchmark_attn.py dfe1a59e4b Add var-seq-len to FA3 fp16 / bf16 fwd (#1072) 8 months ago
benchmark_flash_attention_fp8.py 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
block_info.h 7f67966cc7 FA3 initial code release 8 months ago
epilogue_fwd_sm90_tma.hpp 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
flash.h dfe1a59e4b Add var-seq-len to FA3 fp16 / bf16 fwd (#1072) 8 months ago
flash_api.cpp 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
flash_attn_interface.py 3aae9c18c1 Revert "Changes For FP8 (#1075)" 8 months ago
flash_bwd_hdim128_fp16_sm90.cu 7f67966cc7 FA3 initial code release 8 months ago
flash_bwd_hdim256_fp16_sm90.cu 7f67966cc7 FA3 initial code release 8 months ago
flash_bwd_hdim64_fp16_sm90.cu 7f67966cc7 FA3 initial code release 8 months ago
flash_bwd_kernel.h 7f67966cc7 FA3 initial code release 8 months ago
flash_bwd_launch_template.h cb516f855b Remove torchlib dependency from cpp files (#1083) 8 months ago
flash_bwd_preprocess_kernel.h 7f67966cc7 FA3 initial code release 8 months ago
flash_fwd_hdim128_bf16_sm90.cu 74b0761ff7 [FA3] BF16 forward 8 months ago
flash_fwd_hdim128_e4m3_sm90.cu 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
flash_fwd_hdim128_fp16_sm90.cu 7f67966cc7 FA3 initial code release 8 months ago
flash_fwd_hdim256_bf16_sm90.cu 74b0761ff7 [FA3] BF16 forward 8 months ago
flash_fwd_hdim256_e4m3_sm90.cu 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
flash_fwd_hdim256_fp16_sm90.cu 7f67966cc7 FA3 initial code release 8 months ago
flash_fwd_hdim64_bf16_sm90.cu 74b0761ff7 [FA3] BF16 forward 8 months ago
flash_fwd_hdim64_e4m3_sm90.cu 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
flash_fwd_hdim64_fp16_sm90.cu 7f67966cc7 FA3 initial code release 8 months ago
flash_fwd_kernel.h 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
flash_fwd_launch_template.h 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
kernel_traits.h 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
mainloop_fwd_sm90_tma_gmma_ws.hpp 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
named_barrier.hpp 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
seq_len.h dfe1a59e4b Add var-seq-len to FA3 fp16 / bf16 fwd (#1072) 8 months ago
setup.py 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
softmax.h 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
static_switch.h dfe1a59e4b Add var-seq-len to FA3 fp16 / bf16 fwd (#1072) 8 months ago
test_flash_attn.py 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
tile_scheduler.hpp 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago
utils.h 5018ac6ac5 Fp8 kernel with "in-kernel" transpose of V in producer (#1100) 8 months ago