AlpinDale
|
80be38ca6f
chore: expose phi3_v num_crops as an mm_processor_kwargs (#1117)
|
1 week ago |
AlpinDale
|
18e0b0e932
chore: support loading weights by ID within models (#1116)
|
1 week ago |
AlpinDale
|
949f974c59
(1/N) XQA: integrate the XQA CUDA kernels within Aphrodite (#1115)
|
1 week ago |
AlpinDale
|
2cd6ef2d5a
misc: skip dumping inputs when unpicklable
|
2 weeks ago |
AlpinDale
|
be0b0c13ca
tests: update scheduler tests (#1113)
|
2 weeks ago |
AlpinDale
|
5b03d67abb
Core: add output streaming support to multi-step + async (#1112)
|
2 weeks ago |
AlpinDale
|
b20c4570d2
CI: bump aphrodite-engine to v0.6.6 (#1111)
|
2 weeks ago |
AlpinDale
|
7c825e50be
fix: correct FP8 support check on Ada+ GPUs by using compressed-tensors (#1110)
|
2 weeks ago |
AlpinDale
|
2bb9c9c399
Revert "CI: use self-hosted runner for the build job"
|
2 weeks ago |
AlpinDale
|
c71d2cf814
CI: use self-hosted runner for the build job
|
2 weeks ago |
AlpinDale
|
5c00851691
tests: fix ruff for llava onevision tests
|
2 weeks ago |
AlpinDale
|
6d8df254c7
LoRA: skip loading unsupported weight modules (#1109)
|
2 weeks ago |
AlpinDale
|
f20f5c3491
samplers: improved DRY performance (#1108)
|
2 weeks ago |
AlpinDale
|
2dc917fcfd
fix: install the headless opencv
|
2 weeks ago |
AlpinDale
|
eb1ffacf74
Spec Decoding: fix typical acceptance sampler with correct recovered tok IDs (#1106)
|
2 weeks ago |
AlpinDale
|
76088aa43a
distributed: allow IPv6 in APHRODITE_HOST_IP with ZMQ (#1105)
|
2 weeks ago |
AlpinDale
|
69cf654901
LoRA: add assertions for SGMV kernels to avoid incorrect results (#1104)
|
2 weeks ago |
AlpinDale
|
c90abcc603
VLM: add pipeline parallelism support for Qwen2-VL (#1103)
|
2 weeks ago |
AlpinDale
|
cc5e185795
VLM: support passing multimodal processor kwargs (#1102)
|
2 weeks ago |
AlpinDale
|
1448857bd3
XPU: fix docker build
|
2 weeks ago |
AlpinDale
|
c36dd3a4b6
build: fix CPU CMake compilation
|
2 weeks ago |
AlpinDale
|
98e174b1f4
build: fix cutlass fetch warning
|
2 weeks ago |
AlpinDale
|
a0f0160b79
spec decode: remove dead code from draft bonus tokens (#1101)
|
2 weeks ago |
AlpinDale
|
a5bfc2bc3d
VLM: add support for LLaVA-Onevision model (#1100)
|
2 weeks ago |
AlpinDale
|
d44da0332c
misc: rename `CudaMemoryProfiler` to `DeviceMemoryProfiler` (#1099)
|
2 weeks ago |
AlpinDale
|
7ce3174039
VLM: refactor blip models to support composite weight loading (#1098)
|
2 weeks ago |
AlpinDale
|
91d03c04d2
VLM: refactor composite weight loading logic (#1097)
|
2 weeks ago |
AlpinDale
|
b65449b5ad
moe: refactor DBRX experts to support FusedMoE (#1095)
|
2 weeks ago |
AlpinDale
|
ed63c079f7
Triton: remove atomic add op from awq triton (#1094)
|
2 weeks ago |
AlpinDale
|
651678d2df
VLM: use `SequenceData.from_token_counts` to create dummy data (#1093)
|
2 weeks ago |