..
rpc | 638c08d9dc | fix: clean shutdown issues (#1047) | 4 weeks ago
tool_parsers | a56bce4c94 | fix: remove duplicate assignment in Hermes2ProToolParser | 4 weeks ago
__init__.py | 07aa2a492f | upstream: add option to specify tokenizer | 1 year ago
api_server.py | 638c08d9dc | fix: clean shutdown issues (#1047) | 4 weeks ago
args.py | 313e198557 | api: implement OpenAI-compatible tools API for Hermes/Mistral models (#993) | 1 month ago
logits_processors.py | 62111fab17 | feat: allow serving encoder-decoder models in the API server (#664) | 4 months ago
protocol.py | 055c8905a3 | api: add sampling/engine option to return only deltas or final output (#1035) | 4 weeks ago
run_batch.py | 81fa31bcaf | feat: embeddings support for batched OAI endpoint (#676) | 4 months ago
samplers.json | ac82b67f75 | feat: naive context shift and various QoL changes (#289) | 10 months ago
serving_chat.py | 1264e0b5d8 | api: add mistral function calling format to all models loaded with "mistral" format (#1053) | 3 weeks ago
serving_completions.py | 055c8905a3 | api: add sampling/engine option to return only deltas or final output (#1035) | 4 weeks ago
serving_embedding.py | 0c162c8dad | api: use fp32 for base64 embeddings (#919) | 1 month ago
serving_engine.py | c5c09720b0 | api: log prompt truncation (#940) | 1 month ago
serving_tokenization.py | 055c8905a3 | api: add sampling/engine option to return only deltas or final output (#1035) | 4 weeks ago