__init__.py | bd0ddf1cfe | feat: EETQ quantization (#408) | 9 months ago
aqlm.py | 41beab5dc1 | add exllamav2 tensor paralell, fused MoE for GPTQ/AWQ | 10 months ago
awq.py | 41beab5dc1 | add exllamav2 tensor paralell, fused MoE for GPTQ/AWQ | 10 months ago
base_config.py | aa244761ed | formatting and typing | 10 months ago
bitsandbytes.py | fa083286e3 | Speculative Decoding Part 4: Lookahead scheduling (#402) | 9 months ago
eetq.py | 8d26cf3876 | simplify model_executor logic | 9 months ago
exl2.py | ea26c91e52 | proper typing | 10 months ago
gguf.py | ea26c91e52 | proper typing | 10 months ago
gptq.py | 41beab5dc1 | add exllamav2 tensor paralell, fused MoE for GPTQ/AWQ | 10 months ago
hadamard.safetensors | c3a221eb02 | feat: GGUF, QuIP#, and Marlin support (#228) | 1 year ago
marlin.py | ea26c91e52 | proper typing | 10 months ago
quip.py | ea26c91e52 | proper typing | 10 months ago
quip_utils.py | e42a78381a | feat: switch from pylint to ruff (#322) | 10 months ago
schema.py | 7528e0ce3e | make detokenization optional | 9 months ago
squeezellm.py | ea26c91e52 | proper typing | 10 months ago