Tri Dao
|
68bf390920
Update Cutlass to fix mem fence
|
2 days ago |
Tri Dao
|
7a802796e1
Big refactor and update
|
2 days ago |
Driss Guessous
|
36dddb891c
Remove unused 224 configs (#1425)
|
3 days ago |
XiaobingZhang
|
0dfb281743
don't save inputs buffer of FlashAttenFunc to reduce memory usage for inference mode (#1383)
|
4 weeks ago |
Tri Dao
|
f86e3dd919
[CI] Use MAX_JOBS=1 with nvcc 12.3, don't need OLD_GENERATOR_PATH
|
1 month ago |
Tri Dao
|
b7d29fb3b7
Bump to v2.7.2
|
1 month ago |
Tri Dao
|
d3b1cd1c13
[CI] Use MAX_JOBS=2 even with nvcc 12.3
|
1 month ago |
Tri Dao
|
61a23ea8a2
Update to Cutlass 3.6.0
|
1 month ago |
Tri Dao
|
9375ac9322
[CI] Don't include <ATen/cuda/CUDAGraphsUtils.cuh>
|
1 month ago |
Tri Dao
|
e782d28692
[CI] Change torch #include to make it work with torch 2.1 Philox
|
1 month ago |
Tri Dao
|
073afd5931
[CI] Use torch 2.6.0.dev20241001, reduce torch #include
|
1 month ago |
Tri Dao
|
cf0f4c38ef
[CI] Fix CUDA version for torch 2.6
|
1 month ago |
Tri Dao
|
cc408f9fbf
Bump to v2.7.1
|
1 month ago |
Tri Dao
|
88786928ed
[CI] Drop Python 3.8, add Python 3.13, add pytorch 2.6.0.dev20241010
|
1 month ago |
Adam Louly
|
df1a744887
Add how to import FA3 to documentation (#1112)
|
1 month ago |
Michael Melesse
|
b518517cb8
[AMD] Triton Backend for ROCm (#1203)
|
1 month ago |
sclarkson
|
1feb711f46
Fix compilation with clang on ARM64 (#1285)
|
1 month ago |
Kai Londenberg
|
0823cf7b5d
Fix FA3 Varlen Performance regression (#1361)
|
1 month ago |
Alexander Gessler
|
ca71144d59
flash_bwd_kernel.h: add maybe_unused annotation to suppress lengthy warnings about dtanh not being used in some compile permutations. (#1363)
|
1 month ago |
Neil Tenenholtz
|
7153673c1a
Fix swiglu backwards return type (#1337)
|
1 month ago |
Tri Dao
|
641db759ab
[CI] Pytorch 2.5.1 does not support python 3.8
|
1 month ago |
Tri Dao
|
7435839e3d
Update README for FA3
|
1 month ago |
Tri Dao
|
241c682c9f
[CI] Switch back to CUDA 12.4
|
1 month ago |
Tri Dao
|
c555642172
Bump to v2.7.0
|
1 month ago |
Tri Dao
|
6ffeb572b1
[CI] Still use CUDA 12.3 but pull the right pytorch version
|
1 month ago |
Ethan Steinberg
|
42f2b8be34
Use CUDA 12.4 in the build system (#1326)
|
1 month ago |
Tri Dao
|
2f6c633179
Drop support for Pytorch 2.0
|
1 month ago |
rocking
|
88d1657a14
[AMD ROCm] Fix KVcache bug and improve performance (#1328)
|
1 month ago |
Kai Londenberg
|
284e2c6e5b
Make FA3 paged attention ready for upgrade to Cutlass 3.6 (#1331)
|
1 month ago |
Kai Londenberg
|
b443207c1f
Paged Attention support for FA3 (#1268)
|
2 months ago |