| Author | SHA1 | Message | Date |
|---|---|---|---|
| | db73f03cdc | fix: use ParallelLMHead for MLPSpeculator | 5 months ago |
| | 0f4a9ee77b | quantized lm_head (#582) | 5 months ago |
| | de7e6919c0 | feat: support tied weights and input scale for MLPSpeculator | 6 months ago |
| | 51cfadeb29 | fix: `MLPSpeculator` handling of `num_speculative_tokens` | 6 months ago |
| | af43576da0 | feat: add MLPSpeculator speculative decoding support (#572) | 6 months ago |