AlpinDale 60ca1e1e5e feat: add ngram prompt lookup decoding for speculative decoding (#438) 8 months ago
..
__init__.py f8dfac6372 chore: attention refactor and upstream sync apr01 (#365) 9 months ago
cpu_executor.py 60ca1e1e5e feat: add ngram prompt lookup decoding for speculative decoding (#438) 8 months ago
executor_base.py d8c4193704 feat: Speculative Decoding using a draft model (#432) 8 months ago
gpu_executor.py 60ca1e1e5e feat: add ngram prompt lookup decoding for speculative decoding (#438) 8 months ago
neuron_executor.py d8c4193704 feat: Speculative Decoding using a draft model (#432) 8 months ago
ray_gpu_executor.py 60ca1e1e5e feat: add ngram prompt lookup decoding for speculative decoding (#438) 8 months ago