PygmalionAI's large-scale inference engine
pygmalion.chat
It is designed to serve as the inference endpoint for the PygmalionAI website, and to allow serving the Pygmalion models to a large number of users with blazing fast speeds (thanks to vLLM's Paged Attention).
Aphrodite is the official backend engine for PygmalionAI. It is designed to serve as the inference endpoint for the PygmalionAI website, and to allow serving the Pygmalion models to a large number of users with blazing fast speeds (thanks to FasterTransformer).
Aphrodite builds upon and integrates the exceptional work from various projects, including:
You will likely need a CUDA version of at least 11.0 and a Compute Capability of at least 7.0. CUDA 12.0 is unsupported, so please switch to 11.8!
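As a rough illustration of the requirements above, here is a minimal sketch of a pre-flight check. The function names `parse_version` and `meets_requirements` are hypothetical and not part of Aphrodite. The sketch also assumes that every CUDA 12.x release counts as unsupported, which the text above only states for 12.0.

```python
def parse_version(text):
    """Turn a version string such as '11.8' into a (major, minor) tuple."""
    parts = [int(part) for part in text.strip().split(".")[:2]]
    # Pad a bare major version such as '11' to (11, 0).
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def meets_requirements(cuda_version, compute_capability):
    """Return True if the CUDA version and GPU compute capability look usable.

    Assumption: CUDA must be at least 11.0 and below 12.0, and the
    compute capability must be at least (7, 0).
    """
    cuda = parse_version(cuda_version)
    cuda_ok = (11, 0) <= cuda < (12, 0)
    capability_ok = tuple(compute_capability) >= (7, 0)
    return cuda_ok and capability_ok


if __name__ == "__main__":
    # Example values only; substitute the ones reported by your system.
    print(meets_requirements("11.8", (8, 6)))
    print(meets_requirements("12.0", (8, 6)))
```

If PyTorch is installed, `torch.version.cuda` and `torch.cuda.get_device_capability()` can supply the two arguments. Note that `torch.version.cuda` is `None` on CPU-only builds, so that case needs handling before calling the check.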
Linux-only
We accept PRs! There will likely be a few typos or other errors we've failed to catch, so please let us know, either by opening an issue or by making a Pull Request.