AlpinDale | 9a7d5514c4 | feat: introduce MQAphroditeEngine (#1056) | 1 week ago
AlpinDale | fe01e2ded8 | chore: move `device` keys to a constant (#1020) | 2 weeks ago
AlpinDale | 231693151b | benchmarks: add `--async-engine` arg to throughput benchmark (#988) | 2 weeks ago
AlpinDale | f7f3fed265 | feat: add async postprocessor (#925) | 2 weeks ago
AlpinDale | 132aa2abe4 | spec decode: add support for EAGLE (#899) | 3 weeks ago
AlpinDale | 48a8693aed | feat: multi-step scheduling (#831) | 1 month ago
AlpinDale | bfc8988116 | feat: add cuda sampling kernels for top_k and top_p (#828) | 1 month ago
Pyroserenus | ee5964465d | chore: max_num_seqs in throughput benchmark (#770) | 3 months ago
AlpinDale | 73177656ed | feat: quant_llm support (#755) | 3 months ago
AlpinDale | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago