AlpinDale | 135dfd648b | fix: LoRA support for Cohere and Jamba models (#1004) | 1 month ago
AlpinDale | 3bb0f07461 | chore: rename `task_handler` to `worker` (#985) | 1 month ago
AlpinDale | 0dfa6b60ec | core: support logprobs with multi-step scheduling (#963) | 1 month ago
AlpinDale | 9f3e7c86e2 | feat: add fused Marlin MoE kernel (#934) | 1 month ago
50h100a | 9022c6d869 | remove progress_bar imports | 3 months ago
50h100a | 9576096b9d | iterate over weights normally | 3 months ago
AlpinDale | d34e083c48 | feat: add experts_int8 support (#730) | 4 months ago
AlpinDale | 4ec08af18b | chore: update fused MoE weight loading (#700) | 4 months ago
AlpinDale | 0e558e9b2f | fix: loading chameleon model with TP>1 (#695) | 4 months ago
AlpinDale | 3f712cd287 | feat: add progress bar for loading individual weight modules (#640) | 4 months ago
AlpinDale | bf88c8567e | feat: mamba model support (#674) | 4 months ago
AlpinDale | 8583aefed7 | chore: mamba cache single buffer (#673) | 4 months ago
AlpinDale | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago