aqlm | 705821a7fe | feat: AQLM quantization support (#293) | 10 months ago
awq | 41beab5dc1 | add exllamav2 tensor paralell, fused MoE for GPTQ/AWQ | 9 months ago
bitsandbytes | a98babfb74 | fix: bnb on Turing GPUs (#299) | 10 months ago
exl2 | 41beab5dc1 | add exllamav2 tensor paralell, fused MoE for GPTQ/AWQ | 9 months ago
fp8 | 071269e406 | feat: FP8 E4M3 KV Cache (#405) | 9 months ago
fp8_e5m2_kvcache | 8e1cd54497 | fix: do not include fp8 for rocm (#271) | 10 months ago
gguf | 89c32b40ec | chore: add new imatrix quants (#320) | 10 months ago
gptq | 41beab5dc1 | add exllamav2 tensor paralell, fused MoE for GPTQ/AWQ | 9 months ago
int8_kvcache | 9810daa699 | feat: INT8 KV Cache (#298) | 10 months ago
marlin | 41beab5dc1 | add exllamav2 tensor paralell, fused MoE for GPTQ/AWQ | 9 months ago
quip | aebd68c632 | feat: backport kernels (#235) | 11 months ago
squeezellm | 8fa608aeb7 | feat: replace Ray with NCCL for control plane comms (#221) | 11 months ago