AlpinDale | 86bf2cc4f3 | core: rename `PromptInputs,inputs` -> `PromptType,prompt` (#1080) | 1 day ago
AlpinDale | 1264e0b5d8 | api: add mistral function calling format to all models loaded with "mistral" format (#1053) | 1 week ago
AlpinDale | a985143768 | core: add cuda graph support for encoder-decoder models (#1051) | 1 week ago
AlpinDale | 055c8905a3 | api: add sampling/engine option to return only deltas or final output (#1035) | 1 week ago
AlpinDale | f644e10449 | vlm: enable multimodal inputs for the LLM class (#992) | 2 weeks ago
AlpinDale | f7f3fed265 | feat: add async postprocessor (#925) | 2 weeks ago
AlpinDale | f797294b29 | fix: `add_generation_template` -> `add_generation_prompt` in llm (#877) | 4 weeks ago
AlpinDale | e5b1afe625 | feat: add chat method for LLM class (#822) | 1 month ago
AlpinDale | 62111fab17 | feat: allow serving encoder-decoder models in the API server (#664) | 4 months ago
AlpinDale | a0e446a17d | feat: initial encoder-decoder support with BART model (#633) | 4 months ago
AlpinDale | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
AlpinDale | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago
AlpinDale | f8dfac6372 | chore: attention refactor and upstream sync apr01 (#365) | 9 months ago
AlpinDale | e42a78381a | feat: switch from pylint to ruff (#322) | 10 months ago
AlpinDale | ac82b67f75 | feat: naive context shift and various QoL changes (#289) | 10 months ago
AlpinDale | c3a221eb02 | feat: GGUF, QuIP#, and Marlin support (#228) | 11 months ago
AlpinDale | 641bb0f6e9 | feat: add custom allreduce kernels (#224) | 11 months ago
AlpinDale | c0aac15421 | feat: S-LoRA support (#222) | 11 months ago
AlpinDale | 8fa608aeb7 | feat: replace Ray with NCCL for control plane comms (#221) | 11 months ago
AlpinDale | f013d714c0 | chore: merge dev branch into main (#177) | 1 year ago
AlpinDale | 2755a48d51 | merge dev branch into main (#153) | 1 year ago
AlpinDale | 8834ecf9de | chore: clean up refactor endpoints (#98) | 1 year ago
AlpinDale | c70abc7522 | fix the LLM class for quantization | 1 year ago
AlpinDale | 6b9561ef07 | adapt TGI incremental detokenization | 1 year ago
AlpinDale | 388d7545dd | fix: circular import | 1 year ago
AlpinDale | c761d38c69 | fix: sort outputs and avoid unwanted list copy | 1 year ago
AlpinDale | 56077f0f29 | upstream: trust remote code | 1 year ago
AlpinDale | 724852dc31 | chore: refactoring cont. | 1 year ago
AlpinDale | 5169163403 | chore: add tokenizer mode for slow/fast tokenizers | 1 year ago
AlpinDale | 07aa2a492f | upstream: add option to specify tokenizer | 1 year ago