| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| AlpinDale | 1efd0f89b7 | feat: support FP8 for DeepSeekV2 MoE | 6 months ago |
| AlpinDale | cdc0e498a9 | fix: illegal memory access in FP8 MoE kernel | 6 months ago |
| AlpinDale | 3e7d5f7d14 | chore: reloading fused_moe config on the last chunk | 6 months ago |
| AlpinDale | 3b2666314d | fix: add chunking mechanism to fused_moe | 6 months ago |
| AlpinDale | 336eb4dbf8 | fix: raise error in moe kernel if it receives more than 65k tokens | 6 months ago |
| AlpinDale | bbde979ecd | DeepSeek-V2 (#579) | 6 months ago |
| AlpinDale | 156f577f79 | feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569) | 6 months ago |
| AlpinDale | 4bdd2f9892 | chore: enhance MoE benchmarking | 6 months ago |
| AlpinDale | 00acf371f9 | rocm: fused topk softmax | 6 months ago |
| AlpinDale | 1e35cef979 | feat: add arctic snowflake model (#551) | 6 months ago |
| AlpinDale | 0751a2ecf6 | fix expert_ids shape in Moe | 6 months ago |
| AlpinDale | db9beeb79c | fix typo | 7 months ago |
| AlpinDale | b565928d3f | fix: compute_dtype in MoE kernel | 7 months ago |
| AlpinDale | 36660b55c2 | chore: mixtral fp8 w/ static scales (#542) | 7 months ago |
| AlpinDale | fca911ee0a | vLLM Upstream Sync (#526) | 7 months ago |
| AlpinDale | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 9 months ago |
| AlpinDale | f8dfac6372 | chore: attention refactor and upstream sync apr01 (#365) | 10 months ago |