AlpinDale de62ceb18c refactor: eliminate parallel worker per-step task scheduling overhead 7 months ago
__init__.py 9d81716bfd [v0.5.3] Release Candidate (#388) 10 months ago
batch_expansion.py a94de94c44 refactor: combine the prefill and decode into a single API (#553) 7 months ago
interfaces.py ef733aee43 implement ExecuteModelData to reduce executor complexity 7 months ago
metrics.py 9d81716bfd [v0.5.3] Release Candidate (#388) 10 months ago
multi_step_worker.py a94de94c44 refactor: combine the prefill and decode into a single API (#553) 7 months ago
ngram_worker.py de62ceb18c refactor: eliminate parallel worker per-step task scheduling overhead 7 months ago
spec_decode_worker.py de62ceb18c refactor: eliminate parallel worker per-step task scheduling overhead 7 months ago
top1_proposer.py e42d0b3455 possibly improve ngram efficiency 7 months ago
util.py be8154a8a0 feat: proper embeddings API with e5-mistral-7b support 7 months ago