AlpinDale
|
172dee2573
(2/N) Triton Backend: integrate Triton activation kernels (#1126)
|
3 days ago |
AlpinDale
|
0c17153073
(1/N) Triton Backend: integrate Triton layernorm kernels (#1125)
|
5 days ago |
AlpinDale
|
349a612338
chore: bump bitsandbytes version to latest; enable cuda graphs for 4bit bnb (#1123)
|
5 days ago |
AlpinDale
|
ede17d5039
fix: torch.compile dynamo fix (#1122)
|
5 days ago |
AlpinDale
|
e294eede32
Quantization: re-enable awq_marlin serialization (#1121)
|
5 days ago |
AlpinDale
|
3e6addcc2c
LLM: enable batched inference for llm.chat() API (#1120)
|
5 days ago |
AlpinDale
|
fa84f8102e
kernels: split marlin kernels for faster compile, fix MoE, temporarily remove HQQ (#1119)
|
5 days ago |
AlpinDale
|
6b75a66c60
fix: unsafe all-reduce sync (#1118)
|
6 days ago |
AlpinDale
|
80be38ca6f
chore: expose phi3_v num_crops as an mm_processor_kwargs (#1117)
|
1 week ago |
AlpinDale
|
18e0b0e932
chore: support loading weights by ID within models (#1116)
|
1 week ago |
AlpinDale
|
949f974c59
(1/N) XQA: integrate the XQA CUDA kernels within Aphrodite (#1115)
|
1 week ago |
AlpinDale
|
2cd6ef2d5a
misc: skip dumping inputs when unpicklable
|
2 weeks ago |
AlpinDale
|
be0b0c13ca
tests: update scheduler tests (#1113)
|
2 weeks ago |
AlpinDale
|
5b03d67abb
Core: add output streaming support to multi-step + async (#1112)
|
2 weeks ago |
AlpinDale
|
b20c4570d2
CI: bump aphrodite-engine to v0.6.6 (#1111)
|
2 weeks ago |
AlpinDale
|
7c825e50be
fix: correct FP8 support check on Ada+ GPUs by using compressed-tensors (#1110)
|
2 weeks ago |
AlpinDale
|
2bb9c9c399
Revert "CI: use self-hosted runner for the build job"
|
2 weeks ago |
AlpinDale
|
c71d2cf814
CI: use self-hosted runner for the build job
|
2 weeks ago |
AlpinDale
|
5c00851691
tests: fix ruff for llava onevision tests
|
2 weeks ago |
AlpinDale
|
6d8df254c7
LoRA: skip loading unsupported weight modules (#1109)
|
2 weeks ago |
AlpinDale
|
f20f5c3491
samplers: improved DRY performance (#1108)
|
2 weeks ago |
AlpinDale
|
2dc917fcfd
fix: install the headless opencv
|
2 weeks ago |
AlpinDale
|
eb1ffacf74
Spec Decoding: fix typical acceptance sampler with correct recovered tok IDs (#1106)
|
2 weeks ago |
AlpinDale
|
76088aa43a
distributed: allow IPv6 in APHRODITE_HOST_IP with ZMQ (#1105)
|
2 weeks ago |
AlpinDale
|
69cf654901
LoRA: add assertions for SGMV kernels to avoid incorrect results (#1104)
|
2 weeks ago |
AlpinDale
|
c90abcc603
VLM: add pipeline parallelism support for Qwen2-VL (#1103)
|
2 weeks ago |
AlpinDale
|
cc5e185795
VLM: support passing multimodal processor kwargs (#1102)
|
2 weeks ago |
AlpinDale
|
1448857bd3
XPU: fix docker build
|
2 weeks ago |
AlpinDale
|
c36dd3a4b6
build: fix CPU CMake compilation
|
2 weeks ago |
AlpinDale
|
98e174b1f4
build: fix cutlass fetch warning
|
2 weeks ago |