__init__.py
|
9d81716bfd
[v0.5.3] Release Candidate (#388)
|
10 months ago |
batch_expansion.py
|
af43576da0
feat: add MLPSpeculator speculative decoding support (#572)
|
7 months ago |
draft_model_runner.py
|
ae04f57ec1
feat: Pipeline Parallel support (#581)
|
7 months ago |
interfaces.py
|
af43576da0
feat: add MLPSpeculator speculative decoding support (#572)
|
7 months ago |
metrics.py
|
7253e9052d
feat: integrate typical acceptance sampling for spec decoding
|
7 months ago |
mlp_speculator_worker.py
|
405bb74612
Control plane comms refactor (#573)
|
7 months ago |
multi_step_worker.py
|
cdff8e89f9
feat: introduce `DraftModelRunner`
|
7 months ago |
ngram_worker.py
|
e0886ee929
feat: add `ProposerWorkerBase` abstract class
|
7 months ago |
proposer_worker_base.py
|
abbb730607
feat: support draft model on different tensor parallel size
|
7 months ago |
smaller_tp_proposer_worker.py
|
b6ff0623a6
chore: clean up branding
|
7 months ago |
spec_decode_worker.py
|
dd378ea063
feat: MLPSpeculator with tensor parallel
|
7 months ago |
top1_proposer.py
|
af43576da0
feat: add MLPSpeculator speculative decoding support (#572)
|
7 months ago |
util.py
|
af43576da0
feat: add MLPSpeculator speculative decoding support (#572)
|
7 months ago |