Name | Commit | Message | Last updated
common | c41462cfcd | feat: exllamav2 quantization (#305) | 10 months ago
endpoints | e59dd4a90d | fix: openai gguf chat template (#312) | 10 months ago
engine | c41462cfcd | feat: exllamav2 quantization (#305) | 10 months ago
kv_quant | 9810daa699 | feat: INT8 KV Cache (#298) | 10 months ago
lora | a1d8ab9f3e | fix: lora on quantized models (barred gguf) (#292) | 10 months ago
modeling | 968bde81bf | fix: tensor parallel with GPTQ and AWQ quants (#307) | 10 months ago
processing | c2d77b1822 | chore: logging refactor (#302) | 10 months ago
task_handler | c2d77b1822 | chore: logging refactor (#302) | 10 months ago
transformers_utils | e59dd4a90d | fix: openai gguf chat template (#312) | 10 months ago
__init__.py | ff898c2c80 | bump version to 0.5.0 (#303) | 10 months ago
py.typed | 1c988a48b2 | fix logging and add py.typed | 1 year ago