.. |
__init__.py
|
04b53d2db5
chore: add initializer files
|
1 year ago |
cache_engine.py
|
bf88c8567e
feat: mamba model support (#674)
|
4 months ago |
cpu_model_runner.py
|
55b7ce56c1
cpu: fix `mm_limits` initialization (#873)
|
1 month ago |
cpu_worker.py
|
89a2c6dee1
chore: refactor `MultiModalConfig` initialization and profiling (#745)
|
3 months ago |
embedding_model_runner.py
|
89a2c6dee1
chore: refactor `MultiModalConfig` initialization and profiling (#745)
|
3 months ago |
enc_dec_model_runner.py
|
1405051912
attention: add `AttentionState` abstraction (#863)
|
1 month ago |
model_runner.py
|
abfd4465ca
feat: add support for chunked prefill + prefix caching (#871)
|
1 month ago |
model_runner_base.py
|
48a8693aed
feat: multi-step scheduling (#831)
|
1 month ago |
multi_step_model_runner.py
|
48a8693aed
feat: multi-step scheduling (#831)
|
1 month ago |
multi_step_worker.py
|
48a8693aed
feat: multi-step scheduling (#831)
|
1 month ago |
neuron_model_runner.py
|
008e646c7e
chore: add support for up to 2048 block size (#715)
|
3 months ago |
neuron_worker.py
|
008e646c7e
chore: add support for up to 2048 block size (#715)
|
3 months ago |
openvino_model_runner.py
|
bf88c8567e
feat: mamba model support (#674)
|
4 months ago |
openvino_worker.py
|
bf88c8567e
feat: mamba model support (#674)
|
4 months ago |
tpu_model_runner.py
|
81c5f196eb
chore: various TPU fixes and optimizations (#746)
|
3 months ago |
tpu_worker.py
|
81c5f196eb
chore: various TPU fixes and optimizations (#746)
|
3 months ago |
utils.py
|
89a2c6dee1
chore: refactor `MultiModalConfig` initialization and profiling (#745)
|
3 months ago |
worker.py
|
48a8693aed
feat: multi-step scheduling (#831)
|
1 month ago |
worker_base.py
|
48a8693aed
feat: multi-step scheduling (#831)
|
1 month ago |
xpu_model_runner.py
|
89a2c6dee1
chore: refactor `MultiModalConfig` initialization and profiling (#745)
|
3 months ago |
xpu_worker.py
|
9094a8a2a3
xpu: refactor XPU worker & executor (#861)
|
1 month ago |