AlpinDale | 709628a74d | fix | 5 months ago
AlpinDale | b0e420e711 | some debug statements | 5 months ago
AlpinDale | 31b8636e89 | don't do regular top-k/top-p sampling if kernels are enabled | 5 months ago
AlpinDale | 815736fc54 | feat: add cuda kernels for sampling | 5 months ago
AlpinDale | dd18c5042c | move prepare_inputs to the GPU (#596) | 5 months ago
AlpinDale | be8154a8a0 | feat: proper embeddings API with e5-mistral-7b support | 6 months ago
AlpinDale | d287afd917 | optimize get_logprobs | 6 months ago
AlpinDale | 79901b76de | logprobs for target model (spec decoding) | 6 months ago
AlpinDale | 35ae01d7ba | refactor: attention metadata term | 6 months ago
AlpinDale | ccf4d5cab6 | disable banned tokens | 6 months ago
AlpinDale | 9ce319b03c | fix: sampler indexing issues in distributed environments (#546) | 6 months ago
AlpinDale | 772b4a4504 | temporarily disable dynatemp | 6 months ago
AlpinDale | b178bc12b3 | fix min_tokens when eos_token_id is None | 6 months ago
AlpinDale | 7d3194e7f4 | revert #244 | 6 months ago
AlpinDale | aed64884c6 | feat: prompt logprobs with chunked prefill (#539) | 6 months ago
AlpinDale | fca911ee0a | vLLM Upstream Sync (#526) | 7 months ago
AlpinDale | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago
50h100a | f663d3fccc | Merge pull request #397 from 50h100a/pr_samplerasserts | 9 months ago
50h100a | 85ae23ac3c | Missed .items() and assert | 9 months ago
50h100a | 43c9858854 | Merge pull request #244 from PygmalionAI/faster_topk | 9 months ago
50h100a | bd564148e2 | Merge branch 'main' of https://github.com/PygmalionAI/aphrodite-engine into ffs | 9 months ago
50h100a | d3dd170a7d | merge main | 9 months ago
AlpinDale | 78d66f16d1 | Chunked Prefill Part 1 (#384) | 10 months ago
AlpinDale | 9181fa0396 | feat: Triton kernels for sampling (#383) | 10 months ago
50h100a | dc09dc2b4d | Merge branch 'main' into pr_samplers | 10 months ago
AlpinDale | f8dfac6372 | chore: attention refactor and upstream sync apr01 (#365) | 10 months ago
50h100a | 7ed57e318d | Overhauled SamplingTensors construction. | 10 months ago
50h100a | d5dbd29db4 | hoist sampler internals into a single function. | 10 months ago
AlpinDale | da223153c6 | feat&fix: cohere support and missing GPU blocks (#333) | 10 months ago
AlpinDale | e42a78381a | feat: switch from pylint to ruff (#322) | 10 months ago