AlpinDale f2b6dc3872 cpu: add support for W8A8 quantization via compressed-tensor (#1017) 2 weeks ago
..
__init__.py 89a2c6dee1 chore: refactor `MultiModalConfig` initialization and profiling (#745) 3 months ago
loader.py f2b6dc3872 cpu: add support for W8A8 quantization via compressed-tensor (#1017) 2 weeks ago
neuron.py 145e554a4d neuron: add 8bit quantization for Neuron (#994) 2 weeks ago
openvino.py 0dfa6b60ec core: support logprobs with multi-step scheduling (#963) 2 weeks ago
tensorizer.py 22a4cd4595 core: fix spec decode metrics and envs circular import (#889) 3 weeks ago
utils.py 9f3e7c86e2 feat: add fused Marlin MoE kernel (#934) 2 weeks ago
weight_utils.py dcb36de9c4 quants: add support for NVIDIA's ModelOpt checkpoints (#1013) 2 weeks ago