__init__.py | 89a2c6dee1 | chore: refactor `MultiModalConfig` initialization and profiling (#745) | 3 months ago |
loader.py | b3f9ab3b72 | quant: add tensor parallel support for bitsandbytes (#1052) | 1 week ago |
neuron.py | 145e554a4d | neuron: add 8bit quantization for Neuron (#994) | 2 weeks ago |
openvino.py | 0dfa6b60ec | core: support logprobs with multi-step scheduling (#963) | 2 weeks ago |
tensorizer.py | 22a4cd4595 | core: fix spec decode metrics and envs circular import (#889) | 3 weeks ago |
utils.py | 9f3e7c86e2 | feat: add fused Marlin MoE kernel (#934) | 2 weeks ago |
weight_utils.py | dcb36de9c4 | quants: add support for NVIDIA's ModelOpt checkpoints (#1013) | 1 week ago |