File | Commit | Latest commit message | Last updated
__init__.py | 07aa2a492f | upstream: add option to specify tokenizer | 1 year ago
block.py | 7df7b8ca53 | optimization: reduce end-to-end overhead from python obj allocation (#666) | 4 months ago
config.py | 867939a6db | bring back cuda kernels for lroa | 3 months ago
connections.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
grammar.py | 0527131e93 | fix: grammar logits processor (#268) | 10 months ago
logger.py | 867939a6db | bring back cuda kernels for lroa | 3 months ago
logits_processor.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago
outputs.py | 0e558e9b2f | fix: loading chameleon model with TP>1 (#695) | 4 months ago
pooling_params.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
sampling_params.py | ad181e3fef | feat: bring back dynatemp (#754) | 3 months ago
sequence.py | 577586309d | chore: multi-step args and sequence modifications (#713) | 3 months ago
test_utils.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago
utils.py | 0b8b407b6d | feat: support profiling with multiple multi-modal inputs per prompt (#712) | 3 months ago