AlpinDale
|
2b85ffb1a5
chore: minor cleanups
|
4 months ago |
AlpinDale
|
4d4e767838
ci: take one of fixing lint issues
|
4 months ago |
AlpinDale
|
0e6c400b13
feat: re-add GGUF (#600)
|
4 months ago |
AlpinDale
|
edffcecc67
chore: add proper logging for spec decoding verification
|
4 months ago |
AlpinDale
|
d357341203
chore: add pipeline parallel support for Qwen
|
4 months ago |
AlpinDale
|
98f9dbd734
feat: Triton Kernels for Punica (#613)
|
4 months ago |
AlpinDale
|
07cc8a56bb
fix: add nemotron to PP_SUPPORTED_MODELS
|
4 months ago |
AlpinDale
|
ea838abb6b
fix: disable enforce_eager for bnb
|
4 months ago |
AlpinDale
|
fce2c2e304
fix: support ignore patterns in model loader
|
4 months ago |
AlpinDale
|
cb44c8daa8
feat: support FP8 KV Cache scales from compressed-tensors
|
4 months ago |
AlpinDale
|
ba371fbbbd
feat: AWQ marlin kernels (#603)
|
4 months ago |
AlpinDale
|
a4cbcfe59f
feat: disable logprob serialization to CPU for spec decode
|
4 months ago |
AlpinDale
|
9be43994fe
feat: fbgemm quantization support (#601)
|
4 months ago |
AlpinDale
|
45a004874c
chore: allow specifying custom Executor
|
4 months ago |
AlpinDale
|
b7a2d52e47
fix: allow using mp executor for pipeline parallel
|
4 months ago |
AlpinDale
|
6671e3a162
feat: add CPU offloading support (#598)
|
4 months ago |
AlpinDale
|
ee2c5d34da
feat: add fp8 channel-wise weight quantization support
|
4 months ago |
AlpinDale
|
6c4c20652b
feat: pipeline parallel support for mixtral
|
4 months ago |
AlpinDale
|
5289c14b24
feat: Asymmetric Tensor Parallel (#594)
|
4 months ago |
AlpinDale
|
ddb28a80a3
fix: bump torch for rocm, unify CUDA_VISIBLE_DEVICES for cuda and rocm
|
4 months ago |
AlpinDale
|
99680b2d23
feat: soft prompts (#589)
|
4 months ago |
AlpinDale
|
5761ef8c35
feat: gemma-2 support
|
4 months ago |
AlpinDale
|
1ff6d4c3d7
feat: support pipeline parallel on indivisible GPU count (#587)
|
4 months ago |
AlpinDale
|
4f7d212b70
feat: remove vision language config
|
4 months ago |
AlpinDale
|
bdf1cc1aec
fix: allow using custom all reduce when pp_size > 1
|
4 months ago |
AlpinDale
|
5240c0da23
fix: avoid unnecessary ray import warnings
|
4 months ago |
AlpinDale
|
5be90c3859
Mamba infrastructure support (#586)
|
4 months ago |
AlpinDale
|
ae04f57ec1
feat: Pipeline Parallel support (#581)
|
4 months ago |
AlpinDale
|
dd378ea063
feat: MLPSpeculator with tensor parallel
|
4 months ago |
AlpinDale
|
3a0fdf7b9b
chore: remove `image_input_type` from VLM config
|
4 months ago |