.. |
__init__.py
|
f2b6dc3872
cpu: add support for W8A8 quantization via compressed-tensor (#1017)
|
1 week ago |
cpu.py
|
f2b6dc3872
cpu: add support for W8A8 quantization via compressed-tensor (#1017)
|
1 week ago |
cuda.py
|
9f3e7c86e2
feat: add fused Marlin MoE kernel (#934)
|
2 weeks ago |
interface.py
|
f2b6dc3872
cpu: add support for W8A8 quantization via compressed-tensor (#1017)
|
1 week ago |
rocm.py
|
81c28d2a7f
fix: use nvml to get consistent device names (#739)
|
3 months ago |
tpu.py
|
f2b6dc3872
cpu: add support for W8A8 quantization via compressed-tensor (#1017)
|
1 week ago |