AlpinDale 34b41e0a87 chore: add coordinator to reduce code duplication in tp and pp 7 months ago
__init__.py 04b53d2db5 chore: add initializer files 1 year ago
cache_engine.py fe21123a1c feat: TPU support (#570) 7 months ago
cpu_model_runner.py fdabb55a4d fix: wrong multi_modal_input format for CPU 7 months ago
cpu_worker.py 50b7c13db0 refactor: attention selector (#552) 8 months ago
embedding_model_runner.py 8d77c69cbd feat: support image processor and add llava example 7 months ago
model_runner.py 34b41e0a87 chore: add coordinator to reduce code duplication in tp and pp 7 months ago
neuron_model_runner.py 35ae01d7ba refactor: attention metadata term 8 months ago
neuron_worker.py fca911ee0a vLLM Upstream Sync (#526) 8 months ago
tpu_model_runner.py fe21123a1c feat: TPU support (#570) 7 months ago
tpu_worker.py a524667db0 fix: device assertion for sdpa backend; fix env for tpu worker 7 months ago
worker.py d0cca80b8b feat: support sharded tensorizer models 7 months ago
worker_base.py 7194047318 remove vllm-nccl 7 months ago