__init__.py
|
9d81716bfd
[v0.5.3] Release Candidate (#388)
|
8 months ago |
block_table.py
|
79d603954e
fix: chunked prefill with v2 block manager (#679)
|
4 months ago |
common.py
|
3d83e64f8e
feat: add metrics for prefix cache hit rate (#829)
|
1 month ago |
cpu_gpu_block_allocator.py
|
3d83e64f8e
feat: add metrics for prefix cache hit rate (#829)
|
1 month ago |
interfaces.py
|
3d83e64f8e
feat: add metrics for prefix cache hit rate (#829)
|
1 month ago |
naive_block.py
|
3d83e64f8e
feat: add metrics for prefix cache hit rate (#829)
|
1 month ago |
prefix_caching_block.py
|
9c9b2dd843
core: improve warmup times for prefix caching in block manager v2 (#920)
|
3 weeks ago |
utils.py
|
a0e446a17d
feat: initial encoder-decoder support with BART model (#633)
|
4 months ago |