__init__.py
|
04b53d2db5
chore: add initializer files
|
1 year ago |
cache_engine.py
|
f40b809d3b
allow using v2 block manager with sliding window
|
7 months ago |
cpu_model_runner.py
|
8d77c69cbd
feat: support image processor and add llava example
|
7 months ago |
cpu_worker.py
|
50b7c13db0
refactor: attention selector (#552)
|
7 months ago |
embedding_model_runner.py
|
8d77c69cbd
feat: support image processor and add llava example
|
7 months ago |
model_runner.py
|
c975bba905
fix: sharded state loader with lora
|
7 months ago |
neuron_model_runner.py
|
35ae01d7ba
refactor: attention metadata term
|
8 months ago |
neuron_worker.py
|
fca911ee0a
vLLM Upstream Sync (#526)
|
8 months ago |
worker.py
|
eb2c5c77df
feat: enforce the max possible seqlen
|
7 months ago |
worker_base.py
|
7194047318
remove vllm-nccl
|
7 months ago |