AlpinDale
|
3a53ff1e01
fix: raise an error for no draft token case when draft_tp>1
|
5 months ago |
AlpinDale
|
16dff9babc
chore: enable bonus token in spec decoding for KV cache based models
|
5 months ago |
AlpinDale
|
af43576da0
feat: add MLPSpeculator speculative decoding support (#572)
|
5 months ago |
AlpinDale
|
e0886ee929
feat: add `ProposerWorkerBase` abstract class
|
6 months ago |
AlpinDale
|
ef733aee43
implement ExecuteModelData to reduce executor complexity
|
6 months ago |
AlpinDale
|
79901b76de
logprobs for target model (spec decoding)
|
6 months ago |
AlpinDale
|
9d81716bfd
[v0.5.3] Release Candidate (#388)
|
8 months ago |
AlpinDale
|
f8dfac6372
chore: attention refactor and upstream sync apr01 (#365)
|
10 months ago |