Tri Dao | 0c04943fa2 | Require CUDA 11.6+, clean up setup.py | 1 year ago
Tri Dao | b1fbbd8337 | Implement splitKV attention | 1 year ago
Tri Dao | cbb4cf5f46 | Don't need to set TORCH_CUDA_ARCH_LIST in setup.py | 1 year ago
Aman Gupta Karmani | aab603af4f | fix binary wheel installation when nvcc is not available (#448) | 1 year ago
Tri Dao | 9c531bdc0a | Use single thread compilation for cuda12.1, torch2.1 to avoid OOM CI | 1 year ago
Tri Dao | 2ddeaa406c | Fix wheel building | 1 year ago
Tri Dao | 3c458cff77 | Merge branch 'feature/demo-wheels' of https://github.com/piercefreeman/flash-attention into piercefreeman-feature/demo-wheels | 1 year ago
Tri Dao | 1c41d2b0e5 | Fix race condition in bwd (overwriting sK) | 1 year ago
Tri Dao | 4f285b3547 | FlashAttention-2 release | 1 year ago
Pierce Freeman | 9af165c389 | Clean setup.py imports | 1 year ago
Pierce Freeman | 494b2aa486 | Add notes to github action workflow | 1 year ago
Pierce Freeman | ea2ed88623 | Refactor and clean of setup.py | 1 year ago
Pierce Freeman | 9fc9820a5b | Strip cuda name from torch version | 1 year ago
Pierce Freeman | 5e4699782a | Allow fallback install | 1 year ago
Pierce Freeman | 0e7769c813 | Guessing wheel URL | 1 year ago
Pierce Freeman | e1faefce9d | Raise cuda error on build | 1 year ago
Pierce Freeman | add4f0bc42 | Scaffolding for wheel prototype | 1 year ago
Max H. Gerlach | 31f78a9814 | Allow adding an optional local version to the package version | 1 year ago
Tri Dao | eff9fe6b80 | Add ninja to pyproject.toml build-system, bump to v1.0.5 | 1 year ago
Tri Dao | ad113948a6 | [Docs] Clearer error message for bwd d > 64, bump to v1.0.4 | 1 year ago
Tri Dao | fbbb107848 | Bump version to v1.0.3.post0 | 1 year ago
Tri Dao | 67ef5d28df | Bump version to 1.0.3 | 1 year ago
Tri Dao | df1344f866 | Bump to v1.0.2 | 1 year ago
Pavel Shvets | 72629ac9ba | add missed module | 1 year ago
Tri Dao | 853ff72963 | Bump version to v1.0.1, fix Cutlass version | 1 year ago
Tri Dao | 74af023316 | Bump version to 1.0.0 | 1 year ago
Tri Dao | dc08ea1c33 | Support H100 for other CUDA extensions | 1 year ago
Tri Dao | 1b18f1b7a1 | Support H100 | 1 year ago
Tri Dao | 33e0860c9c | Bump to v0.2.8 | 1 year ago
Tri Dao | d509832426 | [Compilation] Add _NO_HALF2 flags to be consistent with Pytorch | 1 year ago