File | Commit | Message | Last updated
__init__.py | 07aa2a492f | upstream: add option to specify tokenizer | 1 year ago
block.py | 7df7b8ca53 | optimization: reduce end-to-end overhead from python obj allocation (#666) | 4 months ago
config.py | 483c9e6e59 | fix: disable awq_marlin override for awq models (#843) | 1 month ago
connections.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
grammar.py | 8a71788372 | Add OLMoE (#772) | 2 months ago
logger.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
logits_processor.py | 8a71788372 | Add OLMoE (#772) | 2 months ago
outputs.py | 0e558e9b2f | fix: loading chameleon model with TP>1 (#695) | 4 months ago
pooling_params.py | 2f61644f6e | SPMD optimizations (#824) | 1 month ago
sampling_params.py | 72c0584e14 | sampler: add range parameter for DRY | 1 month ago
sequence.py | 48a8693aed | feat: multi-step scheduling (#831) | 1 month ago
test_utils.py | 8a71788372 | Add OLMoE (#772) | 2 months ago
utils.py | 0f1af04cf5 | frontend: minor logging improvements (#787) | 2 months ago