__init__.py | 07aa2a492f | upstream: add option to specify tokenizer | 1 year ago
block.py | 7df7b8ca53 | optimization: reduce end-to-end overhead from python obj allocation (#666) | 4 months ago
config.py | 867939a6db | bring back cuda kernels for lroa | 3 months ago
connections.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
grammar.py | 0527131e93 | fix: grammar logits processor (#268) | 10 months ago
logger.py | 867939a6db | bring back cuda kernels for lroa | 3 months ago
logits_processor.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago
outputs.py | 0e558e9b2f | fix: loading chameleon model with TP>1 (#695) | 4 months ago
pooling_params.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
sampling_params.py | ad181e3fef | feat: bring back dynatemp (#754) | 3 months ago
sequence.py | 577586309d | chore: multi-step args and sequence modifications (#713) | 3 months ago
test_utils.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago
utils.py | 0b8b407b6d | feat: support profiling with multiple multi-modal inputs per prompt (#712) | 3 months ago