AlpinDale | 4d4e767838 | ci: take one of fixing lint issues | 4 months ago
AlpinDale | ae04f57ec1 | feat: Pipeline Parallel support (#581) | 5 months ago
AlpinDale | 63b735bc2a | chore: optimize v2 block manager to match the performance of v1 | 5 months ago
AlpinDale | 237fa59aea | feat: support CPU/GPU swapping in BlockManagerV2 | 5 months ago
AlpinDale | 9099040472 | feat: cross-attention kv caching support | 5 months ago
AlpinDale | 8b56dc4347 | dict -> torch.Tensor for blocks_to_swap | 6 months ago
AlpinDale | 148aca8ff1 | cow => dict[int, list] -> list | 6 months ago
AlpinDale | 25c2b6feca | ignore infeasible swap requests | 6 months ago
AlpinDale | 6f6bf568e5 | enable prefix caching with v2 block manager for spec decoding | 6 months ago
AlpinDale | fca911ee0a | vLLM Upstream Sync (#526) | 6 months ago
AlpinDale | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago