AlpinDale
|
2c653a2268
fix: make speculative decoding work with per-request seed
|
5 months ago |
AlpinDale
|
af43576da0
feat: add MLPSpeculator speculative decoding support (#572)
|
5 months ago |
AlpinDale
|
4d1e613804
chore: minor simplifications
|
5 months ago |
AlpinDale
|
e0886ee929
feat: add `ProposerWorkerBase` abstract class
|
6 months ago |
AlpinDale
|
a94de94c44
refactor: combine the prefill and decode into a single API (#553)
|
6 months ago |
AlpinDale
|
ef733aee43
implement ExecuteModelData to reduce executor complexity
|
6 months ago |
AlpinDale
|
79901b76de
logprobs for target model (spec decoding)
|
6 months ago |
AlpinDale
|
723c6acb84
re-add ngram speculative decoding
|
6 months ago |
AlpinDale
|
9d81716bfd
[v0.5.3] Release Candidate (#388)
|
8 months ago |
AlpinDale
|
f8dfac6372
chore: attention refactor and upstream sync apr01 (#365)
|
10 months ago |