AlpinDale | fc5ef786b0 | Merge branch 'main' into lm_head_lora | 1 week ago
AlpinDale | f20f5c3491 | samplers: improved DRY performance (#1108) | 1 week ago
AlpinDale | 2dc917fcfd | fix: install the headless opencv | 1 week ago
AlpinDale | eb1ffacf74 | Spec Decoding: fix typical acceptance sampler with correct recovered tok IDs (#1106) | 1 week ago
AlpinDale | 76088aa43a | distributed: allow IPv6 in APHRODITE_HOST_IP with ZMQ (#1105) | 1 week ago
AlpinDale | 69cf654901 | LoRA: add assertions for SGMV kernels to avoid incorrect results (#1104) | 1 week ago
AlpinDale | c90abcc603 | VLM: add pipeline parallelism support for Qwen2-VL (#1103) | 1 week ago
AlpinDale | cc5e185795 | VLM: support passing multimodal processor kwargs (#1102) | 1 week ago
AlpinDale | 1448857bd3 | XPU: fix docker build | 1 week ago
AlpinDale | c36dd3a4b6 | build: fix CPU CMake compilation | 1 week ago
AlpinDale | 98e174b1f4 | build: fix cutlass fetch warning | 1 week ago
AlpinDale | a0f0160b79 | spec decode: remove dead code from draft bonus tokens (#1101) | 2 weeks ago
AlpinDale | a5bfc2bc3d | VLM: add support for LLaVA-Onevision model (#1100) | 2 weeks ago
AlpinDale | d44da0332c | misc: rename `CudaMemoryProfiler` to `DeviceMemoryProfiler` (#1099) | 2 weeks ago
AlpinDale | 7ce3174039 | VLM: refactor blip models to support composite weight loading (#1098) | 2 weeks ago
AlpinDale | 91d03c04d2 | VLM: refactor composite weight loading logic (#1097) | 2 weeks ago
AlpinDale | b65449b5ad | moe: refactor DBRX experts to support FusedMoE (#1095) | 2 weeks ago
AlpinDale | ed63c079f7 | Triton: remove atomic add op from awq triton (#1094) | 2 weeks ago
AlpinDale | 651678d2df | VLM: use `SequenceData.from_token_counts` to create dummy data (#1093) | 2 weeks ago
AlpinDale | 7fffa507ff | build: build flash attention kernels inside aphrodite (#1085) | 2 weeks ago
AlpinDale | 3d5b97837f | ci: fix the tag for :latest docker | 2 weeks ago
AlpinDale | d96c363301 | api: fix admin key being required for authentication (#1091) | 2 weeks ago
AlpinDale | 8e7d214d2d | Merge branch 'main' into lm_head_lora | 1 month ago
AlpinDale | 1fac86c325 | core: factor out common code in SequenceData (#1083) | 1 month ago
AlpinDale | ad1205b277 | readme: update attributions (#1082) | 1 month ago
AlpinDale | 193fcee016 | chore: check for torch 2.4.0 when registering custom op (#1081) | 1 month ago
AlpinDale | 86bf2cc4f3 | core: rename `PromptInputs,inputs` -> `PromptType,prompt` (#1080) | 1 month ago
AlpinDale | 766ea79b89 | vlm: fix feature size calculation for llava-next models (#1079) | 1 month ago
AlpinDale | 7b6501bd05 | tests: refactor model tests (#1078) | 1 month ago
AlpinDale | f6df92bde0 | fix: unexpected kwarg for the legacy API server (#1076) | 1 month ago