.. |
__init__.py
|
9d81716bfd
[v0.5.3] Release Candidate (#388)
|
8 months ago |
batch_expansion.py
|
a94de94c44
refactor: combine the prefill and decode into a single API (#553)
|
5 months ago |
interfaces.py
|
ef733aee43
implement ExecuteModelData to reduce executor complexity
|
5 months ago |
metrics.py
|
9d81716bfd
[v0.5.3] Release Candidate (#388)
|
8 months ago |
multi_step_worker.py
|
a94de94c44
refactor: combine the prefill and decode into a single API (#553)
|
5 months ago |
ngram_worker.py
|
de62ceb18c
refactor: eliminate parallel worker per-step task scheduling overhead
|
5 months ago |
spec_decode_worker.py
|
344ddaac5a
properly disable speculative decoding
|
5 months ago |
top1_proposer.py
|
e42d0b3455
possibly improve ngram efficiency
|
5 months ago |
util.py
|
5b0c11d190
support pipeline parallel pynccl groups
|
5 months ago |