AlpinDale
|
d8c4193704
feat: Speculative Decoding using a draft model (#432)
|
9 months ago |
50h100a
|
f67b5be198
chore: port sampler+metadata changes from main to dev (#427)
|
9 months ago |
AlpinDale
|
fa083286e3
Speculative Decoding Part 4: Lookahead scheduling (#402)
|
10 months ago |
AlpinDale
|
3abc641d68
directly use in forward pass
|
10 months ago |
AlpinDale
|
c3c374396b
logprobs fixes
|
10 months ago |
AlpinDale
|
2efee6bcc6
optimize logprob ranks
|
10 months ago |
AlpinDale
|
777b6f6d51
add logprob ranks
|
10 months ago |
AlpinDale
|
0c4ead5e9f
min_tokens
|
10 months ago |
AlpinDale
|
d1786645a3
fix formatting
|
10 months ago |
AlpinDale
|
f01c668259
clean up sampler
|
10 months ago |
AlpinDale
|
78d66f16d1
Chunked Prefill Part 1 (#384)
|
10 months ago |
AlpinDale
|
9181fa0396
feat: Triton kernels for sampling (#383)
|
10 months ago |
AlpinDale
|
f8dfac6372
chore: attention refactor and upstream sync apr01 (#365)
|
10 months ago |
50h100a
|
d5dbd29db4
hoist sampler internals into a single function.
|
10 months ago |
AlpinDale
|
da223153c6
feat&fix: cohere support and missing GPU blocks (#333)
|
11 months ago |
AlpinDale
|
e42a78381a
feat: switch from pylint to ruff (#322)
|
11 months ago |
AlpinDale
|
9fa99215f8
feat: add cubic sampling (#280)
|
11 months ago |
AlpinDale
|
657aec0cbd
refactor: OpenAI endpoint (#261)
|
11 months ago |
AlpinDale
|
4d04ade9ef
feat: fine-grained seeds (#279)
|
11 months ago |
anon998
|
35b9033782
fix: crash in quadratic sampling when batch > 1 (#253)
|
1 year ago |
50h100a
|
f619c96c79
fix: zero token output due to temperature bias (#243)
|
1 year ago |
50h100a
|
53a9c60442
fix: logit processor declarations and application (#242)
|
1 year ago |
AlpinDale
|
e73a92ad2f
fix: remove the mask for quadratic sampling (#236)
|
1 year ago |
AlpinDale
|
1c46fa31ad
feat: add quadratic sampling (#233)
|
1 year ago |
AlpinDale
|
c3a221eb02
feat: GGUF, QuIP#, and Marlin support (#228)
|
1 year ago |
AlpinDale
|
c0aac15421
feat: S-LoRA support (#222)
|
1 year ago |
AlpinDale
|
8fa608aeb7
feat: replace Ray with NCCL for control plane comms (#221)
|
1 year ago |
Stefan Gligorijevic
|
9e7e108dc8
chore: clamp dynatemp_min (#214)
|
1 year ago |
Stefan Gligorijevic
|
56446a04bb
feat: dynamic temperature (#209)
|
1 year ago |
AlpinDale
|
d54791aaa8
feat: reduce sampler overhead by making it less blocking (#198)
|
1 year ago |