Name | Last commit | Commit message | Updated
compressed_tensors | 201db10f02 | models: add support for Phi3 MoE | 2 weeks ago
gguf_utils | 8a71788372 | Add OLMoE (#772) | 2 months ago
kernels | f7f3fed265 | feat: add async postprocessor (#925) | 2 weeks ago
utils | 93bc863591 | feat: Machete Kernels for Hopper GPUs (#842) | 1 month ago
__init__.py | f98e7b2f8c | feat: add HQQ quantization support (#795) | 2 months ago
aqlm.py | ccbda97416 | fix: types in AQLM and GGUF for dynamo support (#736) | 3 months ago
autoquant.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
awq.py | edec2e9a9e | feat: migrate awq and awq_marlin to AphroditeParameter (#702) | 4 months ago
awq_marlin.py | 93bc863591 | feat: Machete Kernels for Hopper GPUs (#842) | 1 month ago
awq_triton.py | fcfcfc65e1 | quants: add triton kernels for AWQ (#946) | 2 weeks ago
base_config.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
bitsandbytes.py | 6bdff60aab | quant: support pre-quanted bitsandbytes checkpoints (#961) | 2 weeks ago
deepspeedfp.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
eetq.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
exl2.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
experts_int8.py | 201db10f02 | models: add support for Phi3 MoE | 2 weeks ago
fbgemm_fp8.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
fp6.py | 73177656ed | feat: quant_llm support (#755) | 3 months ago
fp8.py | 201db10f02 | models: add support for Phi3 MoE | 2 weeks ago
gguf.py | 0dfa6b60ec | core: support logprobs with multi-step scheduling (#963) | 2 weeks ago
gptq.py | ccbda97416 | fix: types in AQLM and GGUF for dynamo support (#736) | 3 months ago
gptq_marlin.py | 93bc863591 | feat: Machete Kernels for Hopper GPUs (#842) | 1 month ago
gptq_marlin_24.py | 5d9021969c | quants: update `qqq` and `gptq_marlin_24` to use AphroditeParameters (#921) | 3 weeks ago
hadamard.safetensors | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago
hqq_marlin.py | f98e7b2f8c | feat: add HQQ quantization support (#795) | 2 months ago
kv_cache.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
marlin.py | 799667737b | quantization: update marlin to use `AphroditeParameters` (#913) | 3 weeks ago
qqq.py | 5d9021969c | quants: update `qqq` and `gptq_marlin_24` to use AphroditeParameters (#921) | 3 weeks ago
quip.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
quip_utils.py | 8a71788372 | Add OLMoE (#772) | 2 months ago
schema.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago
squeezellm.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
tpu_int8.py | f4b62bf803 | quant: update tpu_int8 to use AphroditeParameters (#959) | 2 weeks ago