Revision history

Author SHA1 Message Date
  AlpinDale 6671e3a162 feat: add CPU offloading support (#598) 5 months ago
  AlpinDale 22305c91e9 refactor _prepare_model_input_tensor and attn metadata builder for most backends 5 months ago
  AlpinDale 5289c14b24 feat: Asymmetric Tensor Parallel (#594) 5 months ago
  AlpinDale 99680b2d23 feat: soft prompts (#589) 5 months ago
  AlpinDale c11a8bdaad fix: calculate max number of multi-modal tokens automatically 5 months ago
  AlpinDale 151d782233 fix: attention softcapping for flashinfer 5 months ago
  AlpinDale 4f7d212b70 feat: remove vision language config 5 months ago
  AlpinDale 4599c98f99 feat: dynamic image size support for VLMs 5 months ago
  AlpinDale 5be90c3859 Mamba infrastrucuture support (#586) 5 months ago
  AlpinDale ae04f57ec1 feat: Pipeline Parallel support (#581) 5 months ago
  AlpinDale 3a0fdf7b9b chore: remove `image_input_type` from VLM config 5 months ago
  AlpinDale b6e60143e7 Flashinfer for prefill phase (#580) 5 months ago
  AlpinDale cdff8e89f9 feat: introduce `DraftModelRunner` 5 months ago
  AlpinDale c0c336aaa3 refactor: registry for processing model inputs; quick_gelu; clip model support 5 months ago
  AlpinDale 56e0b8223c chore: add base class for LoRA-supported models 5 months ago
  AlpinDale dead030abf fix: cuda graph with MLPSpeculator 5 months ago
  AlpinDale 405bb74612 Control plane comms refactor (#573) 5 months ago
  AlpinDale 25feb1d592 chore: add support for pinning lora adapters in the lru cache 5 months ago
  AlpinDale af43576da0 feat: add MLPSpeculator speculative decoding support (#572) 5 months ago
  AlpinDale 34b41e0a87 chore: add coordinator to reduce code duplication in tp and pp 6 months ago
  AlpinDale d0cca80b8b feat: support sharded tensorizer models 6 months ago
  AlpinDale 4d1e613804 chore: minor simplifications 6 months ago
  AlpinDale 6cecbbff6a fix: reduce memory footprint of cuda graph by adding output buffer 6 months ago
  AlpinDale c975bba905 fix: sharded state loader with lora 6 months ago
  AlpinDale e321d80e4e fix: `prompt_logprobs==0` case 6 months ago
  AlpinDale 8d77c69cbd feat: support image processor and add llava example 6 months ago
  AlpinDale 08f639b8aa remove duplicate seq_lens_tensor 6 months ago
  AlpinDale f40b809d3b allow using v2 block manager with sliding window 6 months ago
  AlpinDale 5b0c11d190 support pipeline parallel pynccl groups 6 months ago
  AlpinDale de62ceb18c refactor: eliminate parallel worker per-step task scheduling overhead 6 months ago