Commit History

Author SHA1 Message Date
  AlpinDale cda0e93a10 abstract away the platform for device capability 7 months ago
  AlpinDale ae04f57ec1 feat: Pipeline Parallel support (#581) 7 months ago
  AlpinDale 7d79c0e726 chore: use nvml query to avoid accidental cuda initialization 7 months ago
  AlpinDale cdff8e89f9 feat: introduce `DraftModelRunner` 7 months ago
  AlpinDale 405bb74612 Control plane comms refactor (#573) 7 months ago
  AlpinDale 25feb1d592 chore: add support for pinning lora adapters in the lru cache 7 months ago
  AlpinDale af43576da0 feat: add MLPSpeculator speculative decoding support (#572) 7 months ago
  AlpinDale 6a57861fca feat: initial XPU support via intel_extension_for_pytorch (#571) 7 months ago
  AlpinDale d0cca80b8b feat: support sharded tensorizer models 7 months ago
  AlpinDale eb2c5c77df feat: enforce the max possible seqlen 8 months ago
  AlpinDale de62ceb18c refactor: eliminate parallel worker per-step task scheduling overhead 8 months ago
  AlpinDale 236be273e5 feat: tensor parallel speculative decoding (#554) 8 months ago
  AlpinDale 7bcff4ac03 implement sharded state dict 8 months ago
  AlpinDale b984fe4a91 refactor custom allreduce to support multiple tp groups 8 months ago
  AlpinDale be8154a8a0 feat: proper embeddings API with e5-mistral-7b support 8 months ago
  AlpinDale 8ae2cce237 refactor pynccl 8 months ago
  AlpinDale 0e062e66d3 set block size at init 8 months ago
  AlpinDale 8b56dc4347 dict -> torch.Tensor for blocks_to_swap 8 months ago
  AlpinDale 21ce19b3ea blocks_to_copy dict -> torch.Tensor 8 months ago
  AlpinDale ef733aee43 implement ExecuteModelData to reduce executor complexity 8 months ago
  AlpinDale 1879e32510 enable all-reduce for multiple tp groups 8 months ago
  AlpinDale 46159b107a formatting: pt1 9 months ago
  AlpinDale 4c746d8baa chore: init nccl using the gloo backend 9 months ago
  AlpinDale fca911ee0a vLLM Upstream Sync (#526) 9 months ago
  AlpinDale f894f7b176 Revert "reduce dedupe by wrapping in general worker class" 10 months ago
  AlpinDale 9fff6fb507 reduce dedupe by wrapping in general worker class 10 months ago
  AlpinDale 9d81716bfd [v0.5.3] Release Candidate (#388) 10 months ago
  AlpinDale e3252edd07 fix: remove event and stream, add typing (#382) 11 months ago
  AlpinDale f8dfac6372 chore: attention refactor and upstream sync apr01 (#365) 1 year ago
  AlpinDale e42a78381a feat: switch from pylint to ruff (#322) 1 year ago