Author | SHA1 Message | Date |
---|---|---|
| | 00503b9fc1 feat: non-uniform quantization via `compressed-tensors` for llama | 5 months ago |
| | ee2c5d34da feat: add fp8 channel-wise weight quantization support | 5 months ago |
| | 98cb1c4cd1 feat: support fp8 via `llm-compressor` | 5 months ago |
| | e2dbe5f05c feat: add sparse marlin for compressed tensors | 6 months ago |
| | aba03b4756 feat: dynamic per-token activation quantization | 6 months ago |