| File | Commit | Message | Last updated |
|---|---|---|---|
| __init__.py | d9f4c36edd | feat: Medusa speculative decoding support (#590) | 5 months ago |
| arctic.py | 1e35cef979 | feat: add arctic snowflake model (#551) | 6 months ago |
| chatglm.py | 9e73559eba | make use of batched rotary embedding kernels to support long context lora | 6 months ago |
| dbrx.py | fca911ee0a | vLLM Upstream Sync (#526) | 6 months ago |
| falcon.py | fca911ee0a | vLLM Upstream Sync (#526) | 6 months ago |
| jais.py | fca911ee0a | vLLM Upstream Sync (#526) | 6 months ago |
| medusa.py | d9f4c36edd | feat: Medusa speculative decoding support (#590) | 5 months ago |
| mlp_speculator.py | de7e6919c0 | feat: support tied weights and input scale for MLPSpeculator | 5 months ago |
| mpt.py | fca911ee0a | vLLM Upstream Sync (#526) | 6 months ago |