__init__.py | 07aa2a492f | upstream: add option to specify tokenizer | 1 year ago
block.py | ac82b67f75 | feat: naive context shift and various QoL changes (#289) | 10 months ago
config.py | ac79d115b3 | add guards for prefix caching, fp8, chunked, etc | 5 months ago
grammar.py | 0527131e93 | fix: grammar logits processor (#268) | 10 months ago
logger.py | 46159b107a | formatting: pt1 | 6 months ago
logits_processor.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago
outputs.py | be8154a8a0 | feat: proper embeddings API with e5-mistral-7b support | 5 months ago
pooling_params.py | be8154a8a0 | feat: proper embeddings API with e5-mistral-7b support | 5 months ago
sampling_params.py | e8b7f53321 | allow prompt token IDs in the logits processor api | 5 months ago
sequence.py | a94de94c44 | refactor: combine the prefill and decode into a single API (#553) | 5 months ago
test_utils.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago
utils.py | 656459fd84 | make fp8_e4m3 work on nvidia | 5 months ago