sgsdxzy
|
a3b1602391
fix: rope scaling for cohere and qwen (#436)
|
9 months ago |
AlpinDale
|
d8c4193704
feat: Speculative Decoding using a draft model (#432)
|
9 months ago |
AlpinDale
|
76f36af704
feat: LM Format Enforcer support (#428)
|
9 months ago |
AlpinDale
|
c18bf116da
fix stop strings not being excluded from outputs
|
9 months ago |
AlpinDale
|
fe17712f29
fully working chunked prefill
|
9 months ago |
AlpinDale
|
4d33ce60da
feat: Triton flash attention backend for ROCm (#407)
|
9 months ago |
AlpinDale
|
082d4e6972
feat: add chunked prefill scheduler (#406)
|
9 months ago |
AlpinDale
|
7528e0ce3e
make detokenization optional
|
9 months ago |
AlpinDale
|
071269e406
feat: FP8 E4M3 KV Cache (#405)
|
9 months ago |
AlpinDale
|
6f00203041
refactor scheduler for chunked prefill, remove reorder policy for now
|
9 months ago |
AlpinDale
|
9aaeb5d349
add speculative config and arg for later
|
9 months ago |
AlpinDale
|
a304f76d89
feat: Intel CPU support (#403)
|
9 months ago |
AlpinDale
|
f845a661dd
Chunked Prefill Part 2: data update
|
10 months ago |
AlpinDale
|
753f6dc51b
add v2 block manager
|
10 months ago |
AlpinDale
|
3f5ce50c19
add stop_reason
|
10 months ago |
AlpinDale
|
7b9c08afae
vision model support
|
10 months ago |
AlpinDale
|
0c4ead5e9f
min_tokens
|
10 months ago |
AlpinDale
|
d1786645a3
fix formatting
|
10 months ago |
AlpinDale
|
eed70dff76
improve detokenization performance; improve logprobs
|
10 months ago |
AlpinDale
|
2319b411ce
refactor: neuron support
|
10 months ago |
AlpinDale
|
c9cb00c2a1
add warning for mismatch in vocab size
|
10 months ago |
AlpinDale
|
feb5840f2a
feat: async tokenization (#374)
|
10 months ago |
AlpinDale
|
0f6d56b07f
feat: model executor refactor (#367)
|
10 months ago |
AlpinDale
|
b361096463
fix: tokenizer when using ray (#366)
|
10 months ago |
AlpinDale
|
f8dfac6372
chore: attention refactor and upstream sync apr01 (#365)
|
10 months ago |
AlpinDale
|
e42a78381a
feat: switch from pylint to ruff (#322)
|
10 months ago |
AlpinDale
|
c2d77b1822
chore: logging refactor (#302)
|
10 months ago |
AlpinDale
|
9810daa699
feat: INT8 KV Cache (#298)
|
11 months ago |
AlpinDale
|
2d3d44b3e9
chore: add health check for ray workers (#290)
|
11 months ago |
AlpinDale
|
ac82b67f75
feat: naive context shift and various QoL changes (#289)
|
11 months ago |