AlpinDale 6cecbbff6a fix: reduce memory footprint of cuda graph by adding output buffer 7 months ago
__init__.py 04b53d2db5 chore: add initializer files 1 year ago
cache_engine.py f40b809d3b allow using v2 block manager with sliding window 7 months ago
cpu_model_runner.py 8d77c69cbd feat: support image processor and add llava example 7 months ago
cpu_worker.py 50b7c13db0 refactor: attention selector (#552) 7 months ago
embedding_model_runner.py 8d77c69cbd feat: support image processor and add llava example 7 months ago
model_runner.py 6cecbbff6a fix: reduce memory footprint of cuda graph by adding output buffer 7 months ago
neuron_model_runner.py 35ae01d7ba refactor: attention metadata term 8 months ago
neuron_worker.py fca911ee0a vLLM Upstream Sync (#526) 8 months ago
worker.py eb2c5c77df feat: enforce the max possible seqlen 7 months ago
worker_base.py 7194047318 remove vllm-nccl 7 months ago