layers
|
968bde81bf
fix: tensor parallel with GPTQ and AWQ quants (#307)
|
10 months ago |
megatron
|
c2d77b1822
chore: logging refactor (#302)
|
10 months ago |
models
|
c41462cfcd
feat: exllamav2 quantization (#305)
|
10 months ago |
__init__.py
|
8fa608aeb7
feat: replace Ray with NCCL for control plane comms (#221)
|
11 months ago |
hf_downloader.py
|
c41462cfcd
feat: exllamav2 quantization (#305)
|
10 months ago |
loader.py
|
c2d77b1822
chore: logging refactor (#302)
|
10 months ago |
metadata.py
|
9810daa699
feat: INT8 KV Cache (#298)
|
10 months ago |
outlines_decoding.py
|
657aec0cbd
refactor: OpenAI endpoint (#261)
|
10 months ago |
outlines_logits_processors.py
|
657aec0cbd
refactor: OpenAI endpoint (#261)
|
10 months ago |
sampling_metadata.py
|
9fa99215f8
feat: add cubic sampling (#280)
|
10 months ago |
utils.py
|
2755a48d51
merge dev branch into main (#153)
|
1 year ago |