| File | Commit | Message | Last modified |
|---|---|---|---|
| `__init__.py` | 07aa2a492f | upstream: add option to specify tokenizer | 1 year ago |
| `block.py` | fad45609b8 | chore: remove logical token blocks (turns out they are not needed) | 7 months ago |
| `config.py` | ddb28a80a3 | fix: bump torch for rocm, unify CUDA_VISIBLE_DEVICES for cuda and rocm | 6 months ago |
| `grammar.py` | 0527131e93 | fix: grammar logits processor (#268) | 1 year ago |
| `logger.py` | 46159b107a | formatting: pt1 | 8 months ago |
| `logits_processor.py` | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago |
| `outputs.py` | 63b735bc2a | chore: optimize v2 block manager to match the performance of v1 | 7 months ago |
| `pooling_params.py` | be8154a8a0 | feat: proper embeddings API with e5-mistral-7b support | 7 months ago |
| `sampling_params.py` | b3643a7bd7 | fix: min_tokens for when there are multiple eos tokens | 7 months ago |
| `sequence.py` | 16dff9babc | chore: enable bonus token in spec decoding for KV cache based models | 6 months ago |
| `test_utils.py` | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago |
| `utils.py` | ddb28a80a3 | fix: bump torch for rocm, unify CUDA_VISIBLE_DEVICES for cuda and rocm | 6 months ago |