========================  ==========  ==============================================================  ============
File                      Commit      Message                                                         Last change
========================  ==========  ==============================================================  ============
__init__.py               98cb1c4cd1  feat: support fp8 via `llm-compressor`                          5 months ago
marlin_utils.py           141672a0d4  kernels: disambiguate quantized types via a new ScalarType      5 months ago
marlin_utils_fp8.py       4ad2117242  feat: `fp8-marlin` channel-wise quant via `compressed-tensors`  5 months ago
marlin_utils_test.py      141672a0d4  kernels: disambiguate quantized types via a new ScalarType      5 months ago
marlin_utils_test_24.py   141672a0d4  kernels: disambiguate quantized types via a new ScalarType      5 months ago
marlin_utils_test_qqq.py  e3f07b22c3  feat: support for QQQ W4A8 quantization (#612)                  5 months ago
quant_utils.py            141672a0d4  kernels: disambiguate quantized types via a new ScalarType      5 months ago
w8a8_utils.py             869ad77843  fix: remove scaled_fp8_quant_kernel padding footgun             5 months ago
========================  ==========  ==============================================================  ============