| Author | SHA1 | Message | Date |
|---|---|---|---|
| | db73f03cdc | fix: use ParallelLMHead for MLPSpeculator | 5 months ago |
| | 0f4a9ee77b | quantized lm_head (#582) | 5 months ago |
| | de7e6919c0 | feat: support tied weights and input scale for MLPSpeculator | 6 months ago |
| | 51cfadeb29 | fix: `MLPSpeculator` handling of `num_speculative_tokens` | 6 months ago |
| | af43576da0 | feat: add MLPSpeculator speculative decoding support (#572) | 6 months ago |