__init__.py | f8dfac6372 | chore: attention refactor and upstream sync apr01 (#365) | 9 months ago
cpu_executor.py | 60ca1e1e5e | feat: add ngram prompt lookup decoding for speculative decoding (#438) | 8 months ago
executor_base.py | d8c4193704 | feat: Speculative Decoding using a draft model (#432) | 8 months ago
gpu_executor.py | 60ca1e1e5e | feat: add ngram prompt lookup decoding for speculative decoding (#438) | 8 months ago
neuron_executor.py | d8c4193704 | feat: Speculative Decoding using a draft model (#432) | 8 months ago
ray_gpu_executor.py | 60ca1e1e5e | feat: add ngram prompt lookup decoding for speculative decoding (#438) | 8 months ago