.. |
aqlm
|
ccbda97416
fix: types in AQLM and GGUF for dynamo support (#736)
|
3 months ago |
autoquant
|
f1d0b77c92
[0.6.0] Release Candidate (#481)
|
4 months ago |
awq
|
f1d0b77c92
[0.6.0] Release Candidate (#481)
|
4 months ago |
compressed_tensors
|
f1d0b77c92
[0.6.0] Release Candidate (#481)
|
4 months ago |
cutlass_w8a8
|
a401f8e05d
feat: per-tensor token epilogue kernels (#630)
|
4 months ago |
exl2
|
f1d0b77c92
[0.6.0] Release Candidate (#481)
|
4 months ago |
fp8
|
b0f262eec1
feat: FP8 quantization support for AMD ROCm (#729)
|
3 months ago |
gguf
|
ccbda97416
fix: types in AQLM and GGUF for dynamo support (#736)
|
3 months ago |
gptq
|
f1d0b77c92
[0.6.0] Release Candidate (#481)
|
4 months ago |
gptq_marlin
|
6144150398
chore: use scalar type to dispatch to different `gptq_marlin` kernels (#689)
|
4 months ago |
int8_kvcache
|
9810daa699
feat: INT8 KV Cache (#298)
|
10 months ago |
marlin
|
f1d0b77c92
[0.6.0] Release Candidate (#481)
|
4 months ago |
quip
|
f1d0b77c92
[0.6.0] Release Candidate (#481)
|
4 months ago |
squeezellm
|
f1d0b77c92
[0.6.0] Release Candidate (#481)
|
4 months ago |
quant_ops.h
|
ccbda97416
fix: types in AQLM and GGUF for dynamo support (#736)
|
3 months ago |