AlpinDale 0307da9e15 refactor: bitsandbytes -> autoquant 7 months ago
..
aqlm 3bdeb3e116 fix: clang formatting for all kernels (#558) 7 months ago
autoquant 0307da9e15 refactor: bitsandbytes -> autoquant 7 months ago
awq 9d81716bfd [v0.5.3] Release Candidate (#388) 10 months ago
compressed_tensors f4ea11b982 feat: initial support for activation quantization 7 months ago
cutlass_w8a8 f2c6791527 feat: update cutlass fp8 configs 7 months ago
exl2 9d81716bfd [v0.5.3] Release Candidate (#388) 10 months ago
fp8 3bdeb3e116 fix: clang formatting for all kernels (#558) 7 months ago
gguf 9d81716bfd [v0.5.3] Release Candidate (#388) 10 months ago
gptq 3bdeb3e116 fix: clang formatting for all kernels (#558) 7 months ago
gptq_marlin 3bdeb3e116 fix: clang formatting for all kernels (#558) 7 months ago
int8_kvcache 9810daa699 feat: INT8 KV Cache (#298) 1 year ago
marlin d8667fcb98 improve gptq_marlin_24 prefill performance 7 months ago
quip aebd68c632 feat: backport kernels (#235) 1 year ago
squeezellm 8fa608aeb7 feat: replace Ray with NCCL for control plane comms (#221) 1 year ago
quant_ops.cpp f4ea11b982 feat: initial support for activation quantization 7 months ago
quant_ops.h f4ea11b982 feat: initial support for activation quantization 7 months ago