AlpinDale
|
577586309d
chore: multi-step args and sequence modifications (#713)
|
4 months ago |
AlpinDale
|
ef40c05cd3
fix: minor adjustments to scheduler and block manager (#667)
|
4 months ago |
AlpinDale
|
7df7b8ca53
optimization: reduce end-to-end overhead from python obj allocation (#666)
|
4 months ago |
AlpinDale
|
62111fab17
feat: allow serving encoder-decoder models in the API server (#664)
|
4 months ago |
AlpinDale
|
a0e446a17d
feat: initial encoder-decoder support with BART model (#633)
|
4 months ago |
AlpinDale
|
f1d0b77c92
[0.6.0] Release Candidate (#481)
|
4 months ago |
AlpinDale
|
9d81716bfd
[v0.5.3] Release Candidate (#388)
|
8 months ago |
AlpinDale
|
9181fa0396
feat: Triton kernels for sampling (#383)
|
9 months ago |
AlpinDale
|
f8dfac6372
chore: attention refactor and upstream sync apr01 (#365)
|
9 months ago |
AlpinDale
|
e42a78381a
feat: switch from pylint to ruff (#322)
|
10 months ago |
AlpinDale
|
9810daa699
feat: INT8 KV Cache (#298)
|
10 months ago |
AlpinDale
|
ac82b67f75
feat: naive context shift and various QoL changes (#289)
|
10 months ago |
AlpinDale
|
657aec0cbd
refactor: OpenAI endpoint (#261)
|
10 months ago |
AlpinDale
|
4d04ade9ef
feat: fine-grained seeds (#279)
|
10 months ago |
AlpinDale
|
d2db4143fa
feat: add grafana for metrics (#240)
|
11 months ago |
AlpinDale
|
c0aac15421
feat: S-LoRA support (#222)
|
11 months ago |
AlpinDale
|
8fa608aeb7
feat: replace Ray with NCCL for control plane comms (#221)
|
11 months ago |
AlpinDale
|
f013d714c0
chore: merge dev branch into main (#177)
|
1 year ago |
AlpinDale
|
2755a48d51
merge dev branch into main (#153)
|
1 year ago |
50h100a
|
fa0ae5a2c9
feat: new mirostatv2 implementation (#96)
|
1 year ago |
AlpinDale
|
efc6f7fbec
chore: reformats (#90)
|
1 year ago |
AlpinDale
|
e6be0118c9
feat: prompt logprobs and batched samplers (#77)
|
1 year ago |
AlpinDale
|
75c27d3e65
massive overhaul
|
1 year ago |
AlpinDale
|
6dfca14e1f
compute logprobs with log_softmax instead of log
|
1 year ago |
AlpinDale
|
6b9561ef07
adapt TGI incremental detokenization
|
1 year ago |
AlpinDale
|
45f6d9f923
initial refactor commit
|
1 year ago |
AlpinDale
|
f4bb602b74
chore: remove redundant import and minor refactor
|
1 year ago |
AlpinDale
|
c761d38c69
fix: sort outputs and avoid unwanted list copy
|
1 year ago |
AlpinDale
|
7a27bd5f2f
fix: do not allow prompt to exceed max input len
|
1 year ago |
AlpinDale
|
fefbf029c9
revert previous commit
|
1 year ago |