| File | Last commit | Commit message | Last updated |
| --- | --- | --- | --- |
| `__init__.py` | f2b6dc3872 | cpu: add support for W8A8 quantization via compressed-tensor (#1017) | 1 week ago |
| `cpu.py` | f2b6dc3872 | cpu: add support for W8A8 quantization via compressed-tensor (#1017) | 1 week ago |
| `cuda.py` | 9f3e7c86e2 | feat: add fused Marlin MoE kernel (#934) | 2 weeks ago |
| `interface.py` | f2b6dc3872 | cpu: add support for W8A8 quantization via compressed-tensor (#1017) | 1 week ago |
| `rocm.py` | 81c28d2a7f | fix: use nvml to get consistent device names (#739) | 3 months ago |
| `tpu.py` | f2b6dc3872 | cpu: add support for W8A8 quantization via compressed-tensor (#1017) | 1 week ago |