AlpinDale
|
9be43994fe
feat: fbgemm quantization support (#601)
|
5 months ago |
AlpinDale
|
00503b9fc1
feat: non-uniform quantization via `compressed-tensors` for llama
|
5 months ago |
AlpinDale
|
e1475fbec7
feat: MoE support with Pallas GMM kernel for TPUs
|
6 months ago |
AlpinDale
|
4bbf66451a
chore: add CustomAP interface to UnquantizedFusedMoEMethod
|
6 months ago |
AlpinDale
|
1efd0f89b7
feat: support FP8 for DeepSeekV2 MoE
|
6 months ago |
AlpinDale
|
cf472315cc
refactor: isolate FP8 from mixtral
|
6 months ago |