__init__.py
|
89a2c6dee1
chore: refactor `MultiModalConfig` initialization and profiling (#745)
|
3 months ago |
loader.py
|
f2b6dc3872
cpu: add support for W8A8 quantization via compressed-tensor (#1017)
|
2 weeks ago |
neuron.py
|
145e554a4d
neuron: add 8bit quantization for Neuron (#994)
|
2 weeks ago |
openvino.py
|
0dfa6b60ec
core: support logprobs with multi-step scheduling (#963)
|
2 weeks ago |
tensorizer.py
|
22a4cd4595
core: fix spec decode metrics and envs circular import (#889)
|
3 weeks ago |
utils.py
|
9f3e7c86e2
feat: add fused Marlin MoE kernel (#934)
|
2 weeks ago |
weight_utils.py
|
dcb36de9c4
quants: add support for NVIDIA's ModelOpt checkpoints (#1013)
|
2 weeks ago |