Tri Dao | 601b4dc48d | Bump to v2.3.0 | 1 year ago
Tri Dao | 0a1d03c7ea | Bump to v2.2.5 | 1 year ago
Tri Dao | bff3147175 | Re-enable compilation for Hopper | 1 year ago
Tri Dao | 229080b9d2 | Bump to v2.2.4 | 1 year ago
Tri Dao | 43617deab9 | Remove template for (IsEvenMN=T, IsEvenK=F) to speed up compilation | 1 year ago
Tri Dao | 799f56fa90 | Don't compile for Pytorch 2.1 on CUDA 12.1 due to nvcc segfaults | 1 year ago
Tri Dao | c984208ddb | Set block size to 64 x 64 for kvcache to avoid nvcc segfaults | 1 year ago
Tri Dao | 8c8b4d36e1 | Bump to v2.2.3 | 1 year ago
Tri Dao | 08c295c043 | Bump to v2.2.2 | 1 year ago
Tri Dao | a1576ad1e8 | Bump to v2.2.1 | 1 year ago
Tri Dao | 6d673cd961 | Bump to v2.2.0 | 1 year ago
Tri Dao | 4976650f74 | Set single threaded compilation for CUDA 12.2 so CI doesn't OOM | 1 year ago
Tri Dao | 6a89b2f121 | Remove constexpr in launch template to fix CI compilation | 1 year ago
Tri Dao | 97ba7a62e9 | Try switching back to Cutlass 3.2.0 | 1 year ago
Tri Dao | 1dc1b6c8f2 | Bump to v2.1.2 | 1 year ago
Tri Dao | 757058d4d3 | Update Cutlass to v3.2.0 | 1 year ago
Tri Dao | 9e5e8bc91e | Change causal mask to be aligned to bottom-right instead of top-left | 1 year ago
Tri Dao | 6711b3bc40 | Bump version to 2.0.9 | 1 year ago
Tri Dao | c5e87b11e9 | Bump to v2.0.5 | 1 year ago
Tri Dao | d30f2e1cd5 | Bump to v2.0.4 | 1 year ago
Tri Dao | a4e5d1eddd | Bump to v2.0.3 | 1 year ago
Kirthi Shankar Sivamani | 32a953f486 | Request for v2.0.2 (#388) | 1 year ago
Tri Dao | b252072409 | Bump to v2.0.1 | 1 year ago
chuanli11 | 30fd8c17d8 | remove checkout v2.0.0.post1 from dockerfile | 1 year ago
Tri Dao | 4f285b3547 | FlashAttention-2 release | 1 year ago
Tri Dao | 6d48e14a6c | Bump to v1.0.9 | 1 year ago
Tri Dao | 9610114ce8 | Bump to v1.0.8 | 1 year ago
Tri Dao | 85b51d61ee | Bump version to 1.0.7 | 1 year ago
Kirthi Shankar Sivamani | dd9c3a1fc2 | bump to v1.0.6 | 1 year ago
Tri Dao | eff9fe6b80 | Add ninja to pyproject.toml build-system, bump to v1.0.5 | 1 year ago