| File | Commit | Message | Last updated |
|---|---|---|---|
| __init__.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago |
| block_table.py | 79d603954e | fix: chunked prefill with v2 block manager (#679) | 3 months ago |
| common.py | 3d83e64f8e | feat: add metrics for prefix cache hit rate (#829) | 1 month ago |
| cpu_gpu_block_allocator.py | 3d83e64f8e | feat: add metrics for prefix cache hit rate (#829) | 1 month ago |
| interfaces.py | 3d83e64f8e | feat: add metrics for prefix cache hit rate (#829) | 1 month ago |
| naive_block.py | 3d83e64f8e | feat: add metrics for prefix cache hit rate (#829) | 1 month ago |
| prefix_caching_block.py | 9c9b2dd843 | core: improve warmup times for prefix caching in block manager v2 (#920) | 3 weeks ago |
| utils.py | a0e446a17d | feat: initial encoder-decoder support with BART model (#633) | 4 months ago |