__init__.py
|
07aa2a492f
upstream: add option to specify tokenizer
|
1 year ago |
block.py
|
7df7b8ca53
optimization: reduce end-to-end overhead from python obj allocation (#666)
|
4 months ago |
config.py
|
b3f9ab3b72
quant: add tensor parallel support for bitsandbytes (#1052)
|
3 weeks ago |
connections.py
|
f1d0b77c92
[0.6.0] Release Candidate (#481)
|
4 months ago |
envs.py
|
ddaefd8d38
chore: remove engine_use_ray (#1024)
|
4 weeks ago |
grammar.py
|
8a71788372
Add OLMoE (#772)
|
3 months ago |
logger.py
|
22a4cd4595
core: fix spec decode metrics and envs circular import (#889)
|
1 month ago |
logits_processor.py
|
8a71788372
Add OLMoE (#772)
|
3 months ago |
outputs.py
|
055c8905a3
api: add sampling/engine option to return only deltas or final output (#1035)
|
4 weeks ago |
pooling_params.py
|
2f61644f6e
SPMD optimizations (#824)
|
2 months ago |
sampling_params.py
|
055c8905a3
api: add sampling/engine option to return only deltas or final output (#1035)
|
4 weeks ago |
sequence.py
|
055c8905a3
api: add sampling/engine option to return only deltas or final output (#1035)
|
4 weeks ago |
test_utils.py
|
8a71788372
Add OLMoE (#772)
|
3 months ago |
utils.py
|
9bdf8d5bfa
mamba: enable continuous batching for mamba kernels (#1055)
|
3 weeks ago |