AlpinDale | 224280ad45 | Merge branch 'main' into multi-step | 1 month ago
AlpinDale | 2242dc7965 | formatting | 1 month ago
AlpinDale | 2242841380 | add to benchmark | 1 month ago
AlpinDale | 2242cb25dc | fix: unbound tokenizer error | 1 month ago
AlpinDale | 224290913a | remove kv cache estimation | 1 month ago
AlpinDale | 22420ca5d0 | add tests | 1 month ago
AlpinDale | 2242a70c3c | patch gpu executors | 1 month ago
AlpinDale | 2242e966b7 | async engine impl | 1 month ago
AlpinDale | 2242981764 | switch to cpu from numpy for sampled token ids | 1 month ago
AlpinDale | 224262f125 | broadcastable model input in worker base | 1 month ago
AlpinDale | 2242a835e7 | add multistep worker | 1 month ago
AlpinDale | 22428c9911 | add multistep model runner | 1 month ago
AlpinDale | 2242284b19 | add broadcastable model input base | 1 month ago
AlpinDale | 3d83e64f8e | feat: add metrics for prefix cache hit rate (#829) | 1 month ago
AlpinDale | 22425b689d | fix: XPU build | 1 month ago
AlpinDale | bfc8988116 | feat: add cuda sampling kernels for top_k and top_p (#828) | 1 month ago
AlpinDale | 22427602eb | feat: add top-nsigma sampling method | 1 month ago
AlpinDale | 22429e4a10 | fix: sampler test with new transformers version | 1 month ago
AlpinDale | 2f61644f6e | SPMD optimizations (#824) | 1 month ago
AlpinDale | 32a37e8107 | tests: partially fix tensorizer and logprobs tests | 1 month ago
AlpinDale | 7f1c9af5e2 | fix: fp8 quant test | 1 month ago
AlpinDale | 173ac23399 | fix: experts int8 quant test | 1 month ago
AlpinDale | 68f050129d | fix: lora worker manager test import | 1 month ago
AlpinDale | 3661de812d | fix: lora layer test | 1 month ago
AlpinDale | 0a369f9171 | feat: support chunked prefill with LoRA (#823) | 1 month ago
AlpinDale | e5b1afe625 | feat: add chat method for LLM class (#822) | 1 month ago
AlpinDale | 262cbc63b7 | fix: vision api test template path | 1 month ago
AlpinDale | b0113a1eaa | fix: tokenization api test (#821) | 1 month ago
AlpinDale | c6c91edab7 | ci: update & overhaul test units (#769) | 1 month ago
AlpinDale | f088ea81c7 | fix: --max-seq-len-to-capture arg (#818) | 1 month ago