gguf_utils | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago
__init__.py | c66b1b57b1 | Marlin 2:4 sparsity (#555) | 7 months ago
aqlm.py | c66b1b57b1 | Marlin 2:4 sparsity (#555) | 7 months ago
awq.py | c66b1b57b1 | Marlin 2:4 sparsity (#555) | 7 months ago
base_config.py | c66b1b57b1 | Marlin 2:4 sparsity (#555) | 7 months ago
bitsandbytes.py | c66b1b57b1 | Marlin 2:4 sparsity (#555) | 7 months ago
deepspeedfp.py | 4acf34417a | feat: add DeepSpeedFP quantization for all models | 8 months ago
eetq.py | b178ae4b4a | chore: generalize linear_method to be quant_method (#540) | 8 months ago
exl2.py | b178ae4b4a | chore: generalize linear_method to be quant_method (#540) | 8 months ago
fp8.py | c4c153863e | improve fp8 linear layer performance | 8 months ago
gguf.py | b178ae4b4a | chore: generalize linear_method to be quant_method (#540) | 8 months ago
gptq.py | c66b1b57b1 | Marlin 2:4 sparsity (#555) | 7 months ago
gptq_marlin.py | c66b1b57b1 | Marlin 2:4 sparsity (#555) | 7 months ago
gptq_marlin_24.py | 8e11259e90 | missing triton autoconfig for rocm flash attn | 7 months ago
hadamard.safetensors | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago
marlin.py | c66b1b57b1 | Marlin 2:4 sparsity (#555) | 7 months ago
quip.py | c66b1b57b1 | Marlin 2:4 sparsity (#555) | 7 months ago
quip_utils.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago
schema.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago
squeezellm.py | b178ae4b4a | chore: generalize linear_method to be quant_method (#540) | 8 months ago