__init__.py
|
9d81716bfd
[v0.5.3] Release Candidate (#388)
|
8 months ago |
block_table.py
|
79d603954e
fix: chunked prefill with v2 block manager (#679)
|
4 months ago |
common.py
|
3d83e64f8e
feat: add metrics for prefix cache hit rate (#829)
|
1 month ago |
cpu_gpu_block_allocator.py
|
3d83e64f8e
feat: add metrics for prefix cache hit rate (#829)
|
1 month ago |
interfaces.py
|
3d83e64f8e
feat: add metrics for prefix cache hit rate (#829)
|
1 month ago |
naive_block.py
|
3d83e64f8e
feat: add metrics for prefix cache hit rate (#829)
|
1 month ago |
prefix_caching_block.py
|
9c9b2dd843
core: improve warmup times for prefix caching in block manager v2 (#920)
|
3 weeks ago |
utils.py
|
a0e446a17d
feat: initial encoder-decoder support with BART model (#633)
|
4 months ago |