..
aqlm | fca911ee0a | vLLM Upstream Sync (#526) | 8 months ago
awq | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago
bitsandbytes | fca911ee0a | vLLM Upstream Sync (#526) | 8 months ago
exl2 | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago
fp8 | 251568470e | initial nvidia fp8 e4m3 for kv cache | 7 months ago
gguf | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago
gptq | fca911ee0a | vLLM Upstream Sync (#526) | 8 months ago
gptq_marlin | ad1c6b86a1 | gptq_marlin: enable bfloat16 | 7 months ago
int8_kvcache | 9810daa699 | feat: INT8 KV Cache (#298) | 1 year ago
marlin | 1225c4dfd6 | fix: illegal mem access crash for marlin | 8 months ago
quip | aebd68c632 | feat: backport kernels (#235) | 1 year ago
squeezellm | 8fa608aeb7 | feat: replace Ray with NCCL for control plane comms (#221) | 1 year ago
quant_ops.cpp | f22b700ee4 | feat: marlin kernels for GPTQ (#547) | 7 months ago
quant_ops.h | c154578c97 | gptq_marlin: 8bit GPTQ support | 7 months ago