block
|
f40b809d3b
allow using v2 block manager with sliding window
|
7 months ago |
__init__.py
|
ac1d46a2ec
feat: begin work on the engine
|
1 year ago |
block_manager_v1.py
|
8b56dc4347
dict -> torch.Tensor for blocks_to_swap
|
7 months ago |
block_manager_v2.py
|
8b56dc4347
dict -> torch.Tensor for blocks_to_swap
|
7 months ago |
embedding_model_block_manager.py
|
be8154a8a0
feat: proper embeddings API with e5-mistral-7b support
|
7 months ago |
evictor_v1.py
|
6f6bf568e5
enable prefix caching with v2 block manager for spec decoding
|
7 months ago |
evictor_v2.py
|
25c2b6feca
ignore infeasible swap requests
|
7 months ago |
interfaces.py
|
be8154a8a0
feat: proper embeddings API with e5-mistral-7b support
|
7 months ago |
policy.py
|
fca911ee0a
vLLM Upstream Sync (#526)
|
8 months ago |
scheduler.py
|
5b0c11d190
support pipeline parallel pynccl groups
|
7 months ago |