__init__.py
|
fa15bad2ea
chore: minor AMD fixes
|
5 months ago |
arctic.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
baichuan.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
bloom.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
chameleon.py
|
a0d031efcc
feat: initial text-to-text support for Chameleon model
|
5 months ago |
chatglm.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
clip.py
|
e26a4ac698
chore: avoid loading the unused layers and init the VLM up to the required feature space
|
5 months ago |
commandr.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
dbrx.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
decilm.py
|
56e0b8223c
chore: add base class for LoRA-supported models
|
6 months ago |
deepseek.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
deepseek_v2.py
|
1efd0f89b7
feat: support FP8 for DeepSeekV2 MoE
|
5 months ago |
falcon.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
fuyu.py
|
e13a66925c
feat: add fuyu vision model and persimmon language model support
|
5 months ago |
gemma.py
|
05e45aeb53
fix: dtype mismatch for paligemma
|
5 months ago |
gemma2.py
|
5761ef8c35
feat: gemma-2 support
|
5 months ago |
gpt2.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
gpt_bigcode.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
gpt_j.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
gpt_neox.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
interfaces.py
|
e76bbe72eb
chore: handle aborted requests for jamba
|
5 months ago |
internlm2.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
jais.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
jamba.py
|
f5d52320da
Port mamba kernels to Aphrodite (#595)
|
5 months ago |
llama.py
|
00503b9fc1
feat: non-uniform quantization via `compressed-tensors` for llama
|
5 months ago |
llama_embedding.py
|
50b7c13db0
refactor: attention selector (#552)
|
6 months ago |
llava.py
|
acbdc50a71
fix: `vocab_size` field access in llava
|
5 months ago |
llava_next.py
|
acbdc50a71
fix: `vocab_size` field access in llava
|
5 months ago |
medusa.py
|
d9f4c36edd
feat: Medusa speculative decoding support (#590)
|
5 months ago |
minicpm.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
mixtral.py
|
00503b9fc1
feat: non-uniform quantization via `compressed-tensors` for llama
|
5 months ago |
mixtral_quant.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
mlp_speculator.py
|
db73f03cdc
fix: use ParallelLMHead for MLPSpeculator
|
5 months ago |
mpt.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
olmo.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
opt.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
orion.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
paligemma.py
|
05e45aeb53
fix: dtype mismatch for paligemma
|
5 months ago |
persimmon.py
|
e13a66925c
feat: add fuyu vision model and persimmon language model support
|
5 months ago |
phi.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
phi3_small.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
phi3v.py
|
ad68d149d8
chore: refactor and decouple phi3v image embedding
|
5 months ago |
qwen.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
qwen2.py
|
fb4c01740c
feat: add asymmetric TP support for Qwen2
|
5 months ago |
qwen2_moe.py
|
1efd0f89b7
feat: support FP8 for DeepSeekV2 MoE
|
5 months ago |
stablelm.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
starcoder2.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |
utils.py
|
00503b9fc1
feat: non-uniform quantization via `compressed-tensors` for llama
|
5 months ago |
xverse.py
|
0f4a9ee77b
quantized lm_head (#582)
|
5 months ago |