AlpinDale 17eb1b7eb9 chore: remove ray health check 7 months ago
__init__.py f8dfac6372 chore: attention refactor and upstream sync apr01 (#365) 11 months ago
cpu_executor.py ef733aee43 implement ExecuteModelData to reduce executor complexity 7 months ago
distributed_gpu_executor.py de62ceb18c refactor: eliminate parallel worker per-step task scheduling overhead 7 months ago
executor_base.py de62ceb18c refactor: eliminate parallel worker per-step task scheduling overhead 7 months ago
gpu_executor.py 236be273e5 feat: tensor parallel speculative decoding (#554) 7 months ago
multiproc_gpu_executor.py 05d6e43244 fix: `torch.compile()` with mp executor backend 7 months ago
multiproc_worker_utils.py eaa06fdd14 fix some f-strings 7 months ago
neuron_executor.py ef733aee43 implement ExecuteModelData to reduce executor complexity 7 months ago
ray_gpu_executor.py 17eb1b7eb9 chore: remove ray health check 7 months ago
ray_utils.py c6a501f682 add multiprocessing executor; make ray optional 7 months ago