__init__.py
|
04b53d2db5
chore: add initializer files
|
1 year ago |
cache_engine.py
|
50b7c13db0
refactor: attention selector (#552)
|
6 months ago |
cpu_model_runner.py
|
f6250c5516
move dockerfiles to root; fix cpu build
|
6 months ago |
cpu_worker.py
|
50b7c13db0
refactor: attention selector (#552)
|
6 months ago |
embedding_model_runner.py
|
de62ceb18c
refactor: eliminate parallel worker per-step task scheduling overhead
|
6 months ago |
model_runner.py
|
5b0c11d190
support pipeline parallel pynccl groups
|
6 months ago |
neuron_model_runner.py
|
35ae01d7ba
refactor: attention metadata term
|
6 months ago |
neuron_worker.py
|
fca911ee0a
vLLM Upstream Sync (#526)
|
7 months ago |
worker.py
|
eb2c5c77df
feat: enforce the max possible seqlen
|
6 months ago |
worker_base.py
|
de62ceb18c
refactor: eliminate parallel worker per-step task scheduling overhead
|
6 months ago |