| Name | Last commit | Commit message | Last updated |
|---|---|---|---|
| rpc | 53d0ba7c7c | api: add endpoint for loading and unloading the model (#926) | 1 month ago |
| __init__.py | 07aa2a492f | upstream: add option to specify tokenizer | 1 year ago |
| api_server.py | a3c03db735 | fix: inline model loading conflicts with lora (#930) | 1 month ago |
| args.py | d46e70ac98 | api: add inline model loading (#928) | 1 month ago |
| logits_processors.py | 62111fab17 | feat: allow serving encoder-decoder models in the API server (#664) | 4 months ago |
| protocol.py | f61acdd3ec | api: add json_schema to OpenAI server (#915) | 1 month ago |
| run_batch.py | 81fa31bcaf | feat: embeddings support for batched OAI endpoint (#676) | 4 months ago |
| samplers.json | ac82b67f75 | feat: naive context shift and various QoL changes (#289) | 10 months ago |
| serving_chat.py | 61c7182491 | feat: enable prompt logprobs in OpenAI API (#720) | 4 months ago |
| serving_completions.py | 61c7182491 | feat: enable prompt logprobs in OpenAI API (#720) | 4 months ago |
| serving_embedding.py | 0c162c8dad | api: use fp32 for base64 embeddings (#919) | 1 month ago |
| serving_engine.py | 1d3a1fec47 | feat: add load/unload endpoints for soft-prompts (#694) | 4 months ago |
| serving_tokenization.py | 3648170750 | fix: gracefully handle missing chat template (#642) | 4 months ago |
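The modules above (`api_server.py`, `serving_chat.py`, `serving_completions.py`) implement an OpenAI-compatible HTTP server, so it can be exercised with the official `openai` client. A minimal sketch, assuming a locally running server; the host, port, and model name below are placeholders, not taken from this listing:

```python
# Sketch: call the OpenAI-compatible chat route served by api_server.py /
# serving_chat.py. Host, port, and model name are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:2242/v1",  # assumed local server address
    api_key="sk-empty",                   # placeholder; local servers often ignore it
)

response = client.chat.completions.create(
    model="my-model",  # hypothetical model identifier
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```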
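The `protocol.py` commit "api: add json_schema to OpenAI server (#915)" indicates support for schema-constrained generation. The sketch below uses OpenAI's standard structured-outputs `response_format` shape; whether this server accepts exactly this shape is an assumption:

```python
# Sketch: request a schema-constrained JSON response, per #915.
# The response_format shape follows OpenAI's structured-outputs convention.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:2242/v1", api_key="sk-empty")  # assumed

response = client.chat.completions.create(
    model="my-model",  # hypothetical
    messages=[{"role": "user", "content": "Give me a person as JSON."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                },
                "required": ["name", "age"],
            },
        },
    },
)
print(response.choices[0].message.content)  # JSON text matching the schema
```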
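The `serving_embedding.py` commit "api: use fp32 for base64 embeddings (#919)" implies that base64-encoded embedding payloads decode as float32. A sketch of requesting and decoding one, assuming the standard OpenAI embeddings request shape; the endpoint address and model name are placeholders:

```python
# Sketch: decode a base64 embedding as float32, matching #919.
import base64

import numpy as np
import requests

resp = requests.post(
    "http://localhost:2242/v1/embeddings",  # assumed local server address
    json={
        "model": "my-embedding-model",  # hypothetical
        "input": "some text to embed",
        "encoding_format": "base64",    # standard OpenAI embeddings option
    },
)
resp.raise_for_status()

b64 = resp.json()["data"][0]["embedding"]
vec = np.frombuffer(base64.b64decode(b64), dtype=np.float32)  # fp32 per #919
print(vec.shape)
```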