block | d8c4193704 | feat: Speculative Decoding using a draft model (#432) | 8 months ago
__init__.py | ac1d46a2ec | feat: begin work on the engine | 1 year ago
block_manager_v1.py | f52aa64fe6 | use the get_len() method instead of manual len calculation | 9 months ago
block_manager_v2.py | fa083286e3 | Speculative Decoding Part 4: Lookahead scheduling (#402) | 9 months ago
evictor.py | 375f24ccca | fix: optimize context shift performance (#380) | 9 months ago
interfaces.py | fa083286e3 | Speculative Decoding Part 4: Lookahead scheduling (#402) | 9 months ago
policy.py | 6f00203041 | refactor scheduler for chunked prefill, remove reorder policy for now | 9 months ago
scheduler.py | c577c31aaa | feat: tree attention | 8 months ago