.. |
__init__.py
|
f8dfac6372
chore: attention refactor and upstream sync apr01 (#365)
|
9 months ago |
cpu_executor.py
|
ef733aee43
implement ExecuteModelData to reduce executor complexity
|
6 months ago |
distributed_gpu_executor.py
|
de62ceb18c
refactor: eliminate parallel worker per-step task scheduling overhead
|
5 months ago |
executor_base.py
|
de62ceb18c
refactor: eliminate parallel worker per-step task scheduling overhead
|
5 months ago |
gpu_executor.py
|
236be273e5
feat: tensor parallel speculative decoding (#554)
|
6 months ago |
multiproc_gpu_executor.py
|
5b0c11d190
support pipeline parallel pynccl groups
|
5 months ago |
multiproc_worker_utils.py
|
eaa06fdd14
fix some f-strings
|
6 months ago |
neuron_executor.py
|
ef733aee43
implement ExecuteModelData to reduce executor complexity
|
6 months ago |
ray_gpu_executor.py
|
de62ceb18c
refactor: eliminate parallel worker per-step task scheduling overhead
|
5 months ago |
ray_utils.py
|
c6a501f682
add multiprocessing executor; make ray optional
|
6 months ago |