AlpinDale
|
e4407bbcb7
fix: do not start a ray cluster when not using ray
|
7 months ago |
AlpinDale
|
ee174ea4fd
fix: guard for lora + chunked prefill
|
7 months ago |
AlpinDale
|
f9a10145d1
fix: v2 block manager + prefix caching
|
7 months ago |
AlpinDale
|
44331a4d00
chore: improve p2p cache generation
|
7 months ago |
AlpinDale
|
c8c6de64cd
fix: typo in pallas backend
|
7 months ago |
AlpinDale
|
cc3486477e
fix: benign multiprocessing error
|
7 months ago |
AlpinDale
|
c482c09a3a
fix: remove duplicated input processing in async engine
|
7 months ago |
AlpinDale
|
d0afe0cd21
fix: suppress mma.sp warning on CUDA 12.5 and above
|
7 months ago |
AlpinDale
|
a33aaf3b42
chore: cleanup compressed tensors
|
7 months ago |
AlpinDale
|
94f4e278ff
fix: illegal mem access for cutlass fp8 kernels
|
7 months ago |
AlpinDale
|
8c32e49029
feat: add avx2 cpu support
|
7 months ago |
AlpinDale
|
a89c9a0e92
fix: device ordinal issues with world_size and stuff
|
7 months ago |
AlpinDale
|
ab7f4ed6e5
chore: revert commit for removing unnecessary copies in flash attn backend
|
7 months ago |
AlpinDale
|
06ed127441
fix: do not raise optimization warning for fp8 quant
|
7 months ago |
AlpinDale
|
7e54c3916d
chore: factor out epilogues from cutlass kernels
|
7 months ago |
AlpinDale
|
a07fc83bc8
chore: proper util for aphrodite version
|
7 months ago |
AlpinDale
|
805fa8721d
feat: use intel_extension_for_pytorch for CPU backend
|
7 months ago |
AlpinDale
|
1d00b61622
feat: w4a16 support for compressed-tensors
|
7 months ago |
AlpinDale
|
34b41e0a87
chore: add coordinator to reduce code duplication in tp and pp
|
7 months ago |
AlpinDale
|
fdabb55a4d
fix: wrong multi_modal_input format for CPU
|
7 months ago |
AlpinDale
|
458c8b5e33
chore: estimated input speed for tqdm
|
7 months ago |
AlpinDale
|
29ddfae8de
fix: typo in scheduler
|
7 months ago |
AlpinDale
|
b4ddd79f3a
fix: warn user when using outdated compiled binary
|
7 months ago |
AlpinDale
|
d0cca80b8b
feat: support sharded tensorizer models
|
7 months ago |
AlpinDale
|
8004c9f782
fix: import for multimodaldata
|
7 months ago |
AlpinDale
|
37c6da9eb3
feat: vectorized fp8 quant kernel
|
7 months ago |
AlpinDale
|
a524667db0
fix: device assertion for sdpa backend; fix env for tpu worker
|
7 months ago |
AlpinDale
|
fe21123a1c
feat: TPU support (#570)
|
7 months ago |
AlpinDale
|
fa58ba87a3
fix: only set executor backend to mp if not multi-node
|
7 months ago |
AlpinDale
|
270bd333af
chore: check if process is on the same node
|
7 months ago |