| SHA1 | Message | Date |
|---|---|---|
| 00503b9fc1 | feat: non-uniform quantization via `compressed-tensors` for llama | 5 months ago |
| 19340b672e | chore: improve min_capability checking for `compressed-tensors` | 5 months ago |
| 500f3b654f | fix: support bias term in compressed-tensors quant | 5 months ago |
| ddb3323f94 | refactor: have w8a8 compressed tensors use `process_weights_after_load` for fp8 | 6 months ago |
| f4ea11b982 | feat: initial support for activation quantization | 6 months ago |