Tri Dao
|
e2e4333c95
Limit to MAX_JOBS=1 with CUDA 12.2
|
6 months ago |
Tri Dao
|
ce73503578
Bump to 2.5.9
|
6 months ago |
Tri Dao
|
d732be1e67
Update to Cutlass 3.5
|
6 months ago |
Tri Dao
|
af627063e3
[CI] Compile for pytorch 2.4.0.dev20240407 (for nvcr 24.05)
|
6 months ago |
Wongboo
|
40e667236c
Update for python3.12 (#870)
|
6 months ago |
Corey James Levinson
|
beb8b8ba9f
add exception to Timeout Error (#963)
|
6 months ago |
lancerts
|
22339db185
remove an unused import (#960)
|
6 months ago |
Wei Ji
|
9c0e9ee86d
Move packaging and ninja from install_requires to setup_requires (#937)
|
7 months ago |
Tri Dao
|
9a11f440d3
Bump to v2.5.8
|
7 months ago |
Tri Dao
|
35060e7450
[CI] Compile for pytorch 2.2.2 and 2.3.0
|
7 months ago |
Tri Dao
|
ec6d22143b
[CrossEntropy] Change ignored_index -> ignore_index
|
7 months ago |
Tri Dao
|
85881f547f
Bump to v2.5.7
|
8 months ago |
Tri Dao
|
2aea958f89
[CI] Compile with torch 2.3.0.dev20240207
|
8 months ago |
Tri Dao
|
656daef4ea
Use Cute's local_tile to get gQ, gK, gV
|
8 months ago |
Tri Dao
|
9eb3d099c1
Transpose out when swapping seqlen_q and num_groups
|
8 months ago |
Ivan Komarov
|
f692b98d80
Fix spurious re-compilations of `rotary_kernel` (#911)
|
8 months ago |
Driss Guessous
|
23e8fa5a26
Add the option for the macro and note (#893)
|
8 months ago |
ljss
|
3e9414f1c3
Minor fix in compute_attn_1rowblock_splitkv (#900)
|
8 months ago |
Tri Dao
|
36587c01cb
[LayerNorm] Update layer_norm_linear
|
9 months ago |
Markus Krimmel
|
6bbc532388
fix: cast the alibi slopes to torch.float32 (#846)
|
9 months ago |
Driss Guessous
|
4a73e903da
Add in macros for defining __grid_constant__ (#852)
|
9 months ago |
Grigory Sizov
|
2a15840f09
Enable paged attention in varlen forward (#831)
|
9 months ago |
Arvind Sundararajan
|
26c9e82743
Support ARM builds (#757)
|
9 months ago |
Chirag Jain
|
50896ec574
Make nvcc threads configurable via environment variable (#885)
|
9 months ago |
Tri Dao
|
6c9e60de56
Bump to v2.5.6
|
9 months ago |
Tri Dao
|
6e2fa30797
[CI] Change torch 2.3.0.dev20240126 to 20240105 for nvcr 24.02
|
9 months ago |
Tri Dao
|
87a1277653
Bump to v2.5.5
|
9 months ago |
Tri Dao
|
2406f28805
Enable headdim 256 backward on consumer GPUs (Ampere, Ada)
|
9 months ago |
Tri Dao
|
43950dda45
Bump to v2.5.4
|
9 months ago |
Tri Dao
|
4d6b794b3c
Update Cutlass to v3.4.1
|
9 months ago |