| File | Commit | Message | Last updated |
| --- | --- | --- | --- |
| __init__.py | f2b6dc3872 | cpu: add support for W8A8 quantization via compressed-tensor (#1017) | 1 week ago |
| cpu.py | f2b6dc3872 | cpu: add support for W8A8 quantization via compressed-tensor (#1017) | 1 week ago |
| cuda.py | 9f3e7c86e2 | feat: add fused Marlin MoE kernel (#934) | 2 weeks ago |
| interface.py | f2b6dc3872 | cpu: add support for W8A8 quantization via compressed-tensor (#1017) | 1 week ago |
| rocm.py | 81c28d2a7f | fix: use nvml to get consistent device names (#739) | 3 months ago |
| tpu.py | f2b6dc3872 | cpu: add support for W8A8 quantization via compressed-tensor (#1017) | 1 week ago |