..
backends | f6250c5516 | move dockerfiles to root; fix cpu build | 7 months ago
ops | 8e11259e90 | missing triton autoconfig for rocm flash attn | 7 months ago
__init__.py | a94de94c44 | refactor: combine the prefill and decode into a single API (#553) | 7 months ago
layer.py | 656459fd84 | make fp8_e4m3 work on nvidia | 7 months ago
selector.py | 19a959a03e | prioritize user selection for attention | 7 months ago