AlpinDale a3c03db735 fix: inline model loading conflicts with lora (#930) 1 month ago
rpc 53d0ba7c7c api: add endpoint for loading and unloading the model (#926) 1 month ago
__init__.py 07aa2a492f upstream: add option to specify tokenizer 1 year ago
api_server.py a3c03db735 fix: inline model loading conflicts with lora (#930) 1 month ago
args.py d46e70ac98 api: add inline model loading (#928) 1 month ago
logits_processors.py 62111fab17 feat: allow serving encoder-decoder models in the API server (#664) 4 months ago
protocol.py f61acdd3ec api: add json_schema to OpenAI server (#915) 1 month ago
run_batch.py 81fa31bcaf feat: embeddings support for batched OAI endpoint (#676) 4 months ago
samplers.json ac82b67f75 feat: naive context shift and various QoL changes (#289) 10 months ago
serving_chat.py 61c7182491 feat: enable prompt logprobs in OpenAI API (#720) 4 months ago
serving_completions.py 61c7182491 feat: enable prompt logprobs in OpenAI API (#720) 4 months ago
serving_embedding.py 0c162c8dad api: use fp32 for base64 embeddings (#919) 1 month ago
serving_engine.py 1d3a1fec47 feat: add load/unload endpoints for soft-prompts (#694) 4 months ago
serving_tokenization.py 3648170750 fix: gracefully handle missing chat template (#642) 4 months ago