__init__.py | d9f4c36edd | feat: Medusa speculative decoding support (#590) | 5 months ago
arctic.py | 1e35cef979 | feat: add arctic snowflake model (#551) | 6 months ago
chatglm.py | 9e73559eba | make use of batched rotary embedding kernels to support long context lora | 6 months ago
dbrx.py | fca911ee0a | vLLM Upstream Sync (#526) | 6 months ago
falcon.py | fca911ee0a | vLLM Upstream Sync (#526) | 6 months ago
jais.py | fca911ee0a | vLLM Upstream Sync (#526) | 6 months ago
medusa.py | d9f4c36edd | feat: Medusa speculative decoding support (#590) | 5 months ago
mlp_speculator.py | de7e6919c0 | feat: support tied weights and input scale for MLPSpeculator | 5 months ago
mpt.py | fca911ee0a | vLLM Upstream Sync (#526) | 6 months ago