AlpinDale
|
45a004874c
chore: allow specifying custom Executor
|
5 months ago |
AlpinDale
|
a26f784240
chore: use the LoRA tokenizer in OpenAI API (#599)
|
5 months ago |
AlpinDale
|
052a6e1eb6
feat: add SPMD worker execution using Ray accelerated DAG
|
5 months ago |
AlpinDale
|
0c17c2a8a7
chore: add commit hash, clean up engine logs
|
5 months ago |
AlpinDale
|
c0c2b1ac20
fix: get_and_reset only when scheduler outputs are not empty
|
5 months ago |
AlpinDale
|
99680b2d23
feat: soft prompts (#589)
|
5 months ago |
AlpinDale
|
4f7d212b70
feat: remove vision language config
|
5 months ago |
AlpinDale
|
5be90c3859
Mamba infrastructure support (#586)
|
5 months ago |
AlpinDale
|
ae04f57ec1
feat: Pipeline Parallel support (#581)
|
5 months ago |
AlpinDale
|
9da2448964
fix: ensure worker model loop is always stopped at the right time
|
5 months ago |
AlpinDale
|
7b04361934
fix: support getting `eos_token_id` from the config file
|
5 months ago |
AlpinDale
|
b8a19ba27f
chore: extend aphrodite metrics logging api
|
5 months ago |
AlpinDale
|
0886c361f4
feat: OpenVINO CPU backend (#576)
|
5 months ago |
AlpinDale
|
c0c336aaa3
refactor: registry for processing model inputs; quick_gelu; clip model support
|
5 months ago |
AlpinDale
|
b3643a7bd7
fix: min_tokens for when there are multiple eos tokens
|
5 months ago |
AlpinDale
|
4ed1bb9958
chore: add fault tolerance for RayTokenizerGroupPool
|
5 months ago |
AlpinDale
|
25feb1d592
chore: add support for pinning lora adapters in the lru cache
|
5 months ago |
AlpinDale
|
6a57861fca
feat: initial XPU support via intel_extension_for_pytorch (#571)
|
6 months ago |
AlpinDale
|
a07fc83bc8
chore: proper util for aphrodite version
|
6 months ago |
AlpinDale
|
fe21123a1c
feat: TPU support (#570)
|
6 months ago |
AlpinDale
|
90ceab32ff
refactor: consolidate prompt args to LLM engines
|
6 months ago |
AlpinDale
|
de62ceb18c
refactor: eliminate parallel worker per-step task scheduling overhead
|
6 months ago |
AlpinDale
|
60e74e92fd
add rope_scaling arg
|
6 months ago |
AlpinDale
|
c6a501f682
add multiprocessing executor; make ray optional
|
6 months ago |
AlpinDale
|
342346afda
improve hashing function
|
6 months ago |
AlpinDale
|
50b7c13db0
refactor: attention selector (#552)
|
6 months ago |
AlpinDale
|
fd0a5c0ea4
raise a warning during preemption and swapping
|
6 months ago |
AlpinDale
|
be8154a8a0
feat: proper embeddings API with e5-mistral-7b support
|
6 months ago |
AlpinDale
|
ef733aee43
implement ExecuteModelData to reduce executor complexity
|
6 months ago |
AlpinDale
|
ba3db54a4b
comment out the chunked debug print
|
6 months ago |