Name | Last commit | Commit message | Last updated
aqlm | ccbda97416 | fix: types in AQLM and GGUF for dynamo support (#736) | 3 months ago
autoquant | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
awq | 0256ed236b | feat: windows support (#790) | 2 months ago
compressed_tensors | 8976805f90 | kernel: asymmetric AQ AZP quantization kernels (#1048) | 1 week ago
cutlass_w8a8 | a401f8e05d | feat: per-tensor token epilogue kernels (#630) | 4 months ago
exl2 | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
fp6 | 73177656ed | feat: quant_llm support (#755) | 3 months ago
fp8 | e14223dce5 | kernel: use `cub::BlockReduce` instead of custom impl (#895) | 3 weeks ago
gguf | ccbda97416 | fix: types in AQLM and GGUF for dynamo support (#736) | 3 months ago
gptq | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
gptq_marlin | a113309876 | kernel: add meta functions for ops to prevent graph breaks (#1019) | 1 week ago
int8_kvcache | 9810daa699 | feat: INT8 KV Cache (#298) | 10 months ago
machete | 2a60b8f8c9 | kernel: do not compile machete for cuda 11 and below (#901) | 3 weeks ago
marlin | 0256ed236b | feat: windows support (#790) | 2 months ago
quip | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
squeezellm | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
quant_ops.h | 8976805f90 | kernel: asymmetric AQ AZP quantization kernels (#1048) | 1 week ago