Commit History

Author SHA1 Message Date
  AlpinDale 86bf2cc4f3 core: rename `PromptInputs,inputs` -> `PromptType,prompt` (#1080) 1 day ago
  AlpinDale 1264e0b5d8 api: add mistral function calling format to all models loaded with "mistral" format (#1053) 1 week ago
  AlpinDale a985143768 core: add cuda graph support for encoder-decoder models (#1051) 1 week ago
  AlpinDale 055c8905a3 api: add sampling/engine option to return only deltas or final output (#1035) 1 week ago
  AlpinDale f644e10449 vlm: enable multimodal inputs for the LLM class (#992) 2 weeks ago
  AlpinDale f7f3fed265 feat: add async postprocessor (#925) 2 weeks ago
  AlpinDale f797294b29 fix: `add_generation_template` -> `add_generation_prompt` in llm (#877) 4 weeks ago
  AlpinDale e5b1afe625 feat: add chat method for LLM class (#822) 1 month ago
  AlpinDale 62111fab17 feat: allow serving encoder-decoder models in the API server (#664) 4 months ago
  AlpinDale a0e446a17d feat: initial encoder-decoder support with BART model (#633) 4 months ago
  AlpinDale f1d0b77c92 [0.6.0] Release Candidate (#481) 4 months ago
  AlpinDale 9d81716bfd [v0.5.3] Release Candidate (#388) 8 months ago
  AlpinDale f8dfac6372 chore: attention refactor and upstream sync apr01 (#365) 9 months ago
  AlpinDale e42a78381a feat: switch from pylint to ruff (#322) 10 months ago
  AlpinDale ac82b67f75 feat: naive context shift and various QoL changes (#289) 10 months ago
  AlpinDale c3a221eb02 feat: GGUF, QuIP#, and Marlin support (#228) 11 months ago
  AlpinDale 641bb0f6e9 feat: add custom allreduce kernels (#224) 11 months ago
  AlpinDale c0aac15421 feat: S-LoRA support (#222) 11 months ago
  AlpinDale 8fa608aeb7 feat: replace Ray with NCCL for control plane comms (#221) 11 months ago
  AlpinDale f013d714c0 chore: merge dev branch into main (#177) 1 year ago
  AlpinDale 2755a48d51 merge dev branch into main (#153) 1 year ago
  AlpinDale 8834ecf9de chore: clean up refactor endpoints (#98) 1 year ago
  AlpinDale c70abc7522 fix the LLM class for quantization 1 year ago
  AlpinDale 6b9561ef07 adapt TGI incremental detokenization 1 year ago
  AlpinDale 388d7545dd fix: circular import 1 year ago
  AlpinDale c761d38c69 fix: sort outputs and avoid unwanted list copy 1 year ago
  AlpinDale 56077f0f29 upstream: trust remote code 1 year ago
  AlpinDale 724852dc31 chore: refactoring cont. 1 year ago
  AlpinDale 5169163403 chore: add tokenizer mode for slow/fast tokenizers 1 year ago
  AlpinDale 07aa2a492f upstream: add option to specify tokenizer 1 year ago