Aphrodite supports a wide variety of generative Transformer models available through Hugging Face Transformers. The following are the model architectures that are currently supported.
Architecture | Example HF Model |
---|---|
AquilaForCausalLM | BAAI/AquilaChat-7B |
ArcticForCausalLM | Snowflake/snowflake-arctic-instruct |
BaiChuanForCausalLM | baichuan-inc/Baichuan2-13B-Chat |
BloomForCausalLM | bigscience/bloomz |
ChatGLMModel | THUDM/chatglm3-6b |
CohereForCausalLM | CohereForAI/c4ai-command-r-v01 |
DbrxForCausalLM | databricks/dbrx-instruct |
DeciLMForCausalLM | Deci/DeciLM-7B |
FalconForCausalLM | tiiuae/falcon-7b |
GemmaForCausalLM | google/gemma-7b |
Gemma2ForCausalLM | google/gemma-2-9b |
GPT2LMHeadModel | gpt2 |
GPTBigCodeForCausalLM | bigcode/starcoder |
GPTJForCausalLM | pygmalionai/pygmalion-6b |
GPTNeoXForCausalLM | EleutherAI/pythia-12b |
InternLMForCausalLM | internlm/internlm-7b |
InternLM2ForCausalLM | internlm/internlm2-7b |
JAISLMHeadModel | core42/jais-13b |
JambaForCausalLM | ai21labs/Jamba-v0.1 |
LlamaForCausalLM | meta-llama/Meta-Llama-3.1-8B |
MiniCPMForCausalLM | openbmb/MiniCPM-2B-dpo-bf16 |
MistralForCausalLM | mistralai/Mistral-7B-v0.1 |
MixtralForCausalLM | mistralai/Mixtral-8x7B-v0.1 |
MPTForCausalLM | mosaicml/mpt-7b |
NemotronForCausalLM | nvidia/Minitron-8B-Base |
OLMoForCausalLM | allenai/OLMo-7B-hf |
OPTForCausalLM | facebook/opt-66b |
OrionForCausalLM | OrionStarAI/Orion-14B-Chat |
PhiForCausalLM | microsoft/phi-2 |
Phi3ForCausalLM | microsoft/Phi-3-medium-128k-instruct |
Phi3SmallForCausalLM | microsoft/Phi-3-small-128k-instruct |
PersimmonForCausalLM | adept/persimmon-8b-chat |
QwenLMHeadModel | Qwen/Qwen-7B |
Qwen2ForCausalLM | Qwen/Qwen2-72B |
Qwen2MoeForCausalLM | Qwen/Qwen1.5-MoE-A2.7B |
StableLmForCausalLM | stabilityai/stablelm-3b-4e1t |
Starcoder2ForCausalLM | bigcode/starcoder2-3b |
XverseForCausalLM | xverse/XVERSE-65B-Chat |
:::info
On ROCm platforms, Mistral and Mixtral are capped to a maximum context length of 4096 tokens due to issues with sliding-window attention.
:::
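As a concrete illustration, the sketch below loads one of the text-only models from the table for offline inference. It assumes Aphrodite exposes the vLLM-style Python API it is based on (`LLM` and `SamplingParams` in the `aphrodite` package) together with the `max_model_len` engine argument, which is one way to apply the 4096-token cap noted above when running Mistral on ROCm.

```python
from aphrodite import LLM, SamplingParams  # assumed vLLM-style Python API

# Load a supported decoder-only model; max_model_len caps the context window
# (e.g. 4096 for Mistral/Mixtral on ROCm, per the note above).
llm = LLM(model="mistralai/Mistral-7B-v0.1", max_model_len=4096)

params = SamplingParams(temperature=0.8, max_tokens=128)
outputs = llm.generate(["The capital of France is"], params)

for output in outputs:
    print(output.outputs[0].text)
```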
The following encoder-decoder (sequence-to-sequence) architecture is also supported.

Architecture | Example Model |
---|---|
BartForConditionalGeneration | facebook/bart-large-cnn |
The following multimodal architectures are supported, along with the input modalities they accept.

Architecture | Supported Modalities | Example Model |
---|---|---|
Blip2ForConditionalGeneration | Image | Salesforce/blip2-opt-6.7b |
ChameleonForConditionalGeneration | Image | facebook/chameleon-7b |
FuyuForCausalLM | Image | adept/fuyu-8b |
InternVLChatModel | Image | OpenGVLab/InternVL2-8B |
LlavaForConditionalGeneration | Image | llava-hf/llava-v1.5-7b-hf |
LlavaNextForConditionalGeneration | Image | llava-hf/llava-v1.6-mistral-7b-hf |
PaliGemmaForConditionalGeneration | Image | google/paligemma-3b-pt-224 |
Phi3VForCausalLM | Image | microsoft/Phi-3.5-vision-instruct |
MiniCPMV | Image | openbmb/MiniCPM-V-2_6 |
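For the vision-language entries above, image inputs are supplied alongside the text prompt. The sketch below assumes Aphrodite follows the vLLM-style `multi_modal_data` input format; the prompt template and `<image>` placeholder are model-specific (a LLaVA-1.5-style prompt is shown here as an assumption, not a fixed interface).

```python
from PIL import Image
from aphrodite import LLM, SamplingParams  # assumed vLLM-style Python API

# Assumed multi_modal_data input format; the "<image>" placeholder and the
# surrounding prompt template vary by model.
llm = LLM(model="llava-hf/llava-v1.5-7b-hf")
image = Image.open("example.jpg")

outputs = llm.generate(
    {
        "prompt": "USER: <image>\nWhat is shown in this image?\nASSISTANT:",
        "multi_modal_data": {"image": image},
    },
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```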
If your model uses any of the architectures above, you can run it with Aphrodite out of the box.
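To check which architecture a given Hugging Face model declares, and therefore whether it appears in the tables above, you can inspect its config. This sketch uses only the `transformers` library; the model ID is just an example from the table.

```python
from transformers import AutoConfig

# The `architectures` field of the HF config lists the model class names,
# which can be matched against the tables above.
config = AutoConfig.from_pretrained(
    "mistralai/Mistral-7B-v0.1", trust_remote_code=True
)
print(config.architectures)  # e.g. ['MistralForCausalLM']
```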