Commit History

Author SHA1 Message Date
  AlpinDale a4cbcfe59f feat: disable logprob serialization to CPU for spec decode 5 months ago
  AlpinDale 45a004874c chore: allow specifying custom Executor 5 months ago
  AlpinDale 6671e3a162 feat: add CPU offloading support (#598) 5 months ago
  AlpinDale cf381a0c54 OpenAI API Refactor (#591) 5 months ago
  AlpinDale 99680b2d23 feat: soft prompts (#589) 5 months ago
  AlpinDale 4f7d212b70 feat: remove vision language config 5 months ago
  AlpinDale 3a0fdf7b9b chore: remove `image_input_type` from VLM config 5 months ago
  AlpinDale 7253e9052d feat: integrate typical acceptance sampling for spec decoding 5 months ago
  AlpinDale b8a19ba27f chore: extend aphrodite metrics logging api 5 months ago
  AlpinDale 0886c361f4 feat: OpenVINO CPU backend (#576) 5 months ago
  AlpinDale c0c336aaa3 refactor: registry for processing model inputs; quick_gelu; clip model support 5 months ago
  AlpinDale 80ac1cdc8f fix: add args for the draft tp 5 months ago
  AlpinDale d44ac8e497 fix: `--preemption_mode` -> `--preemption-mode` 6 months ago
  AlpinDale 6a57861fca feat: initial XPU support via intel_extension_for_pytorch (#571) 6 months ago
  AlpinDale fe21123a1c feat: TPU support (#570) 6 months ago
  AlpinDale fa58ba87a3 fix: only set executor backend to mp if not multi-node 6 months ago
  AlpinDale bba89fc6d3 chore: make the automatic rope scaling behave properly with rope_scaling arg, add rope theta 6 months ago
  AlpinDale c61d9f1aa3 fix: lora_dtype value in args 6 months ago
  AlpinDale ec5b99d075 fix: use named args 6 months ago
  AlpinDale 237fa59aea feat: support CPU/GPU swapping in BlockManagerV2 6 months ago
  AlpinDale 8d77c69cbd feat: support image processor and add llava example 6 months ago
  AlpinDale 690110a051 feat: bitsandbytes quantization 6 months ago
  AlpinDale f40b809d3b allow using v2 block manager with sliding window 6 months ago
  AlpinDale ac79d115b3 add guards for prefix caching, fp8, chunked, etc 6 months ago
  AlpinDale f6250c5516 move dockerfiles to root; fix cpu build 6 months ago
  AlpinDale 4e1ae004da make mp the default distributed backend 6 months ago
  AlpinDale 656459fd84 make fp8_e4m3 work on nvidia 6 months ago
  AlpinDale 60e74e92fd add rope_scaling arg 6 months ago
  AlpinDale 9e73559eba make use of batched rotary embedding kernels to support long context lora 6 months ago
  AlpinDale 7bcff4ac03 implement sharded state dict 6 months ago