| File | Commit | Message | Last change |
| --- | --- | --- | --- |
| `__init__.py` | 98cb1c4cd1 | feat: support fp8 via `llm-compressor` | 5 months ago |
| `compressed_tensors_scheme.py` | 19340b672e | chore: improve min_capability checking for `compressed-tensors` | 5 months ago |
| `compressed_tensors_unquantized.py` | 00503b9fc1 | feat: non-uniform quantization via `compressed-tensors` for llama | 5 months ago |
| `compressed_tensors_w4a16_24.py` | 19340b672e | chore: improve min_capability checking for `compressed-tensors` | 5 months ago |
| `compressed_tensors_w8a8_fp8.py` | d3c474d219 | chore: enable dynamic per-token `fp8` | 5 months ago |
| `compressed_tensors_w8a8_int8.py` | 19340b672e | chore: improve min_capability checking for `compressed-tensors` | 5 months ago |
| `compressed_tensors_wNa16.py` | ba371fbbbd | feat: AWQ marlin kernels (#603) | 5 months ago |
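Each module in this directory implements one quantization scheme for checkpoints produced with `compressed-tensors` / `llm-compressor` (unquantized, w4a16 2:4 sparse, w8a8 fp8, w8a8 int8, wNa16). Judging from the commit messages above (min_capability checking, dynamic per-token fp8), the schemes share a common per-scheme interface. The sketch below is a hypothetical illustration of what such an interface could look like; the class and method names (`QuantScheme`, `get_min_capability`, `create_weights`, `apply`) and the capability value are assumptions, not the repository's actual code.

```python
# Hypothetical sketch of a per-scheme interface; names and values are
# illustrative assumptions, not taken from the repository.
from abc import ABC, abstractmethod

import torch


class QuantScheme(ABC):
    """One weight/activation quantization scheme (e.g. w8a8-fp8, wNa16)."""

    @classmethod
    @abstractmethod
    def get_min_capability(cls) -> int:
        """Minimum CUDA compute capability this scheme requires."""

    @abstractmethod
    def create_weights(self, layer: torch.nn.Module, **kwargs) -> None:
        """Register the quantized weight parameters on the layer."""

    @abstractmethod
    def apply(self, layer: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
        """Run the quantized matmul for this scheme."""


class W8A8Fp8Scheme(QuantScheme):
    """Example: fp8 weights with dynamic per-token activation scales."""

    @classmethod
    def get_min_capability(cls) -> int:
        # Assumed value: fp8 tensor cores need Ada/Hopper-class GPUs (SM 8.9+).
        return 89

    def create_weights(self, layer: torch.nn.Module, **kwargs) -> None:
        ...  # allocate fp8 weight tensor and per-channel weight scales

    def apply(self, layer: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
        ...  # quantize activations per token, then run the fp8 GEMM
```

A per-scheme `get_min_capability` hook of this kind lets the loader reject a scheme early on hardware that cannot run it, which is consistent with the repeated "improve min_capability checking" commits above.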