__init__.py
|
f8dfac6372
chore: attention refactor and upstream sync apr01 (#365)
|
9 months ago |
cpu_executor.py
|
60ca1e1e5e
feat: add ngram prompt lookup decoding for speculative decoding (#438)
|
8 months ago |
executor_base.py
|
d8c4193704
feat: Speculative Decoding using a draft model (#432)
|
8 months ago |
gpu_executor.py
|
60ca1e1e5e
feat: add ngram prompt lookup decoding for speculative decoding (#438)
|
8 months ago |
neuron_executor.py
|
d8c4193704
feat: Speculative Decoding using a draft model (#432)
|
8 months ago |
ray_gpu_executor.py
|
60ca1e1e5e
feat: add ngram prompt lookup decoding for speculative decoding (#438)
|
8 months ago |