Jay Shah | bb230b8c54 | separate gqa compilation | 2 months ago
Jay Shah | 49f1849a60 | lower rtol for fp8 a bit | 2 months ago
Jay Shah | b8f9dc206c | change fp8 tolerances to be smaller | 2 months ago
Ganesh Bikshandi | c0c58ee111 | Revert "re-commiting." | 2 months ago
Ganesh Bikshandi | a44596fb72 | re-commiting. | 2 months ago
Jay Shah | eaf8898b96 | fix submodule | 2 months ago
Ganesh Bikshandi | 0085f04b6a | add fp8 test case. | 2 months ago
Jay Shah | 6bb109238f | change seq len class per discussion | 2 months ago
Jay Shah | 31c71e0218 | add log max splits based on num splits to static switch | 2 months ago
Jay Shah | 794037793b | correct indent | 2 months ago
Jay Shah | 16eb1e53fd | remove deprecated fp8 code | 2 months ago
Jay Shah | 5e3864f2ee | change default output type of fp8 kernel to bf16 | 2 months ago
Jay Shah | f77d9f7d6a | fix composable kernel issue again | 2 months ago
Jay Shah | 33f20a3644 | Merge branch 'fa3-kvcache-gqa' of github.com:Dao-AILab/flash-attention into fa3-kvcache-gqa | 2 months ago
Jay Shah | aa45d75f64 | dont write out zero for split kernel, only lse=-inf | 2 months ago
Ganesh Bikshandi | 5df67d2732 | Merge branch 'fa3-kvcache-gqa' of github.com:Dao-AILab/flash-attention into fa3-kvcache-gqa | 2 months ago
Ganesh Bikshandi | 2b840ef32d | fix the test case and re-factor too. | 2 months ago
Jay Shah | cffef153de | separate out fp8 in test_flash_attn | 2 months ago
Jay Shah | 0f560b7d3c | update composable kernel | 2 months ago
Jay Shah | 64a0a91fe9 | enable Is_local with fp8 | 2 months ago
Jay Shah | 81d402463e | prune unused code | 2 months ago
Jay Shah | be481cac27 | add Is_local back in | 2 months ago
Jay Shah | 6111666130 | consolidate nblock min max methods | 2 months ago
Jay Shah | b5cac6d586 | rebase with Is_local disabled temporarily | 2 months ago
Jay Shah | 2472e5e0b4 | add 'in principle' fp8 kv cache support | 2 months ago
Ganesh Bikshandi | cd55fb3b5d | set correct tolerance limit | 2 months ago
Jay Shah | e36e004cb3 | change fp8 code path to allow for split kernel and kv cache without perf regression | 2 months ago
Ganesh Bikshandi | 70ff847363 | all cases passed. | 2 months ago
Ganesh Bikshandi | 9a4941cb35 | add variable seqlen case. | 2 months ago
Ganesh Bikshandi | c516d6349d | Adding another test case. | 2 months ago