AlpinDale
|
de62ceb18c
refactor: eliminate parallel worker per-step task scheduling overhead
|
vor 5 Monaten |
AlpinDale
|
e42d0b3455
possibly improve ngram efficiency
|
vor 5 Monaten |
AlpinDale
|
16f345c29a
fix circular reference with weakref
|
vor 5 Monaten |
AlpinDale
|
6e63d7a9db
fix memory usage with ngram spec decoding
|
vor 5 Monaten |
AlpinDale
|
ef733aee43
implement ExecuteModelData to reduce executor complexity
|
vor 5 Monaten |
AlpinDale
|
79901b76de
logprobs for target model (spec decoding)
|
vor 5 Monaten |
AlpinDale
|
723c6acb84
re-add ngram speculative decoding
|
vor 5 Monaten |
AlpinDale
|
9d81716bfd
[v0.5.3] Release Candidate (#388)
|
vor 8 Monaten |