Commit history

Author SHA1 Message Date
  AlpinDale cda0e93a10 abstract away the platform for device capability 7 months ago
  AlpinDale ae04f57ec1 feat: Pipeline Parallel support (#581) 7 months ago
  AlpinDale 7d79c0e726 chore: use nvml query to avoid accidental cuda initialization 7 months ago
  AlpinDale cdff8e89f9 feat: introduce `DraftModelRunner` 7 months ago
  AlpinDale 405bb74612 Control plane comms refactor (#573) 7 months ago
  AlpinDale 25feb1d592 chore: add support for pinning lora adapters in the lru cache 7 months ago
  AlpinDale af43576da0 feat: add MLPSpeculator speculative decoding support (#572) 7 months ago
  AlpinDale 6a57861fca feat: initial XPU support via intel_extension_for_pytorch (#571) 7 months ago
  AlpinDale d0cca80b8b feat: support sharded tensorizer models 7 months ago
  AlpinDale eb2c5c77df feat: enforce the max possible seqlen 8 months ago
  AlpinDale de62ceb18c refactor: eliminate parallel worker per-step task scheduling overhead 8 months ago
  AlpinDale 236be273e5 feat: tensor parallel speculative decoding (#554) 8 months ago
  AlpinDale 7bcff4ac03 implement sharded state dict 8 months ago
  AlpinDale b984fe4a91 refactor custom allreduce to support multiple tp groups 8 months ago
  AlpinDale be8154a8a0 feat: proper embeddings API with e5-mistral-7b support 8 months ago
  AlpinDale 8ae2cce237 refactor pynccl 8 months ago
  AlpinDale 0e062e66d3 set block size at init 8 months ago
  AlpinDale 8b56dc4347 dict -> torch.Tensor for blocks_to_swap 8 months ago
  AlpinDale 21ce19b3ea blocks_to_copy dict -> torch.Tensor 8 months ago
  AlpinDale ef733aee43 implement ExecuteModelData to reduce executor complexity 8 months ago
  AlpinDale 1879e32510 enable all-reduce for multiple tp groups 8 months ago
  AlpinDale 46159b107a formatting: pt1 8 months ago
  AlpinDale 4c746d8baa chore: init nccl using the gloo backend 8 months ago
  AlpinDale fca911ee0a vLLM Upstream Sync (#526) 8 months ago
  AlpinDale f894f7b176 Revert "reduce dedupe by wrapping in general worker class" 10 months ago
  AlpinDale 9fff6fb507 reduce dedupe by wrapping in general worker class 10 months ago
  AlpinDale 9d81716bfd [v0.5.3] Release Candidate (#388) 10 months ago
  AlpinDale e3252edd07 fix: remove event and stream, add typing (#382) 11 months ago
  AlpinDale f8dfac6372 chore: attention refactor and upstream sync apr01 (#365) 11 months ago
  AlpinDale e42a78381a feat: switch from pylint to ruff (#322) 1 year ago