Author    | Commit     | Message                                                       | Date
AlpinDale | 071269e406 | feat: FP8 E4M3 KV Cache (#405)                                | 9 months ago
AlpinDale | 41beab5dc1 | add exllamav2 tensor parallel, fused MoE for GPTQ/AWQ         | 9 months ago
AlpinDale | e702f587cf | feat: add batched RoPE kernels (#371)                         | 9 months ago
AlpinDale | 3d6695cfbb | feat: add approximate gelu activation kernels (#370)          | 9 months ago
AlpinDale | c41462cfcd | feat: exllamav2 quantization (#305)                           | 10 months ago
AlpinDale | e0c35bb353 | feat: bitsandbytes and `--load-in{4,8}bit` support (#294)     | 10 months ago
AlpinDale | 705821a7fe | feat: AQLM quantization support (#293)                        | 10 months ago
AlpinDale | e31c6f0b45 | feat: refactor modeling logic and support more models (#274)  | 10 months ago
AlpinDale | d9b65e6c5f | feat: DeepSeek MoE support (#237)                             | 11 months ago
AlpinDale | c3a221eb02 | feat: GGUF, QuIP#, and Marlin support (#228)                  | 11 months ago
AlpinDale | 31c95011a6 | feat: FP8 E5M2 KV Cache (#226)                                | 11 months ago
AlpinDale | 641bb0f6e9 | feat: add custom allreduce kernels (#224)                     | 11 months ago
AlpinDale | 5053743c1c | feat: speedup AWQ (#223)                                      | 11 months ago
AlpinDale | 8fa608aeb7 | feat: replace Ray with NCCL for control plane comms (#221)    | 11 months ago
AlpinDale | 7e72ce0a73 | feat: mixtral tensor parallelism (#193)                       | 1 year ago
AlpinDale | 15a0454172 | feat: FP8 KV Cache (#185)                                     | 1 year ago
AlpinDale | 62b2c4119d | feat: re-write GPTQ and refactor exllama kernels (#152)       | 1 year ago
AlpinDale | 1334a833a4 | feat: AMD ROCm support (#95)                                  | 1 year ago
AlpinDale | 2b1ba581f9 | feat: re-implement GPTQ (#141)                                | 1 year ago
AlpinDale | 8223f85c1b | feat: SqueezeLLM support (#140)                               | 1 year ago
AlpinDale | 1aab8a7d6f | feat: speedup compilation times by 3x (#130)                  | 1 year ago