..
aqlm | fca911ee0a | vLLM Upstream Sync (#526) | 8 months ago
awq | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago
bitsandbytes | fca911ee0a | vLLM Upstream Sync (#526) | 8 months ago
cutlass_w8a8 | 2313c97e3d | add cutlass w8a8 kernels (#556) | 7 months ago
exl2 | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago
fp8 | 251568470e | initial nvidia fp8 e4m3 for kv cache | 7 months ago
gguf | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago
gptq | fca911ee0a | vLLM Upstream Sync (#526) | 8 months ago
gptq_marlin | ad1c6b86a1 | gptq_marlin: enable bfloat16 | 7 months ago
int8_kvcache | 9810daa699 | feat: INT8 KV Cache (#298) | 1 year ago
marlin | c66b1b57b1 | Marlin 2:4 sparsity (#555) | 7 months ago
quip | aebd68c632 | feat: backport kernels (#235) | 1 year ago
squeezellm | 8fa608aeb7 | feat: replace Ray with NCCL for control plane comms (#221) | 1 year ago
quant_ops.cpp | 2313c97e3d | add cutlass w8a8 kernels (#556) | 7 months ago
quant_ops.h | 2313c97e3d | add cutlass w8a8 kernels (#556) | 7 months ago