david/aphrodite-engine @ deepseek_v3: PygmalionAI's large-scale inference engine (pygmalion.chat). It is designed to serve as the inference endpoint for the PygmalionAI website and to serve the Pygmalion models to a large number of users at high speed, thanks to vLLM's Paged Attention.

AlpinDale a56bce4c94 fix: remove duplicate assignment in Hermes2ProToolParser	1 month ago
..
rpc	39b2e83ac3 api: optimize zeromq frontend performance (#951)	1 month ago
tool_parsers	a56bce4c94 fix: remove duplicate assignment in Hermes2ProToolParser	1 month ago
__init__.py	07aa2a492f upstream: add option to specify tokenizer	1 year ago
api_server.py	313e198557 api: implement OpenAI-compatible tools API for Hermes/Mistral models (#993)	1 month ago
args.py	313e198557 api: implement OpenAI-compatible tools API for Hermes/Mistral models (#993)	1 month ago
logits_processors.py	62111fab17 feat: allow serving encoder-decoder models in the API server (#664)	4 months ago
protocol.py	0191c5efd1 tools: fix tool calls to more strictly follow OpenAI format (#1003)	1 month ago
run_batch.py	81fa31bcaf feat: embeddings support for batched OAI endpoint (#676)	4 months ago
samplers.json	ac82b67f75 feat: naive context shift and various QoL changes (#289)	11 months ago
serving_chat.py	7d5feaa037 api: fix logic for deciding if tool parser is used (#1025)	1 month ago
serving_completions.py	61c7182491 feat: enable prompt logprobs in OpenAI API (#720)	4 months ago
serving_embedding.py	0c162c8dad api: use fp32 for base64 embeddings (#919)	1 month ago
serving_engine.py	c5c09720b0 api: log prompt truncation (#940)	1 month ago
serving_tokenization.py	313e198557 api: implement OpenAI-compatible tools API for Hermes/Mistral models (#993)	1 month ago