Commit History

Author SHA1 Message Date
  AlpinDale a985143768 core: add cuda graph support for encoder-decoder models (#1051) 1 week ago
  AlpinDale 271879a4a5 fix: disable chunked prefill and prefix caching for multimodal models (#1037) 1 week ago
  AlpinDale ddaefd8d38 chore: remove engine_use_ray (#1024) 1 week ago
  AlpinDale fe01e2ded8 chore: move `device` keys to a constant (#1020) 2 weeks ago
  AlpinDale 9a42869055 chore: keep chunked prefill enabled with prefix caching (#1007) 2 weeks ago
  AlpinDale 145e554a4d neuron: add 8bit quantization for Neuron (#994) 2 weeks ago
  AlpinDale 510ae5b949 core: fix chunked prefill not being enabled by default for long contexts (#974) 2 weeks ago
  AlpinDale b3f6eeb1d2 vlm: increase the default `max_num_batched_tokens` for multimodal models (#973) 2 weeks ago
  AlpinDale 8d9f1fd4e6 feat: add single user mode (#927) 2 weeks ago
  AlpinDale f7f3fed265 feat: add async postprocessor (#925) 2 weeks ago
  AlpinDale 0c6d90dade neuron: add support for tensor parallelism (#923) 3 weeks ago
  AlpinDale 22a4cd4595 core: fix spec decode metrics and envs circular import (#889) 3 weeks ago
  AlpinDale 901900854e chore: consolidate environment variables within one file (#882) 4 weeks ago
  AlpinDale 48a8693aed feat: multi-step scheduling (#831) 1 month ago
  AlpinDale 2f61644f6e SPMD optimizations (#824) 1 month ago
  AlpinDale f088ea81c7 fix: --max-seq-len-to-capture arg (#818) 1 month ago
  AlpinDale 0256ed236b feat: windows support (#790) 2 months ago
  AlpinDale dcb794a340 fix: revert incorrect commit 2 months ago
  AlpinDale 76367b5ae7 wip 2 months ago
  AlpinDale 7222b84582 feat: ministral support (#776) 2 months ago
  AlpinDale 73177656ed feat: quant_llm support (#755) 3 months ago
  AlpinDale 89a2c6dee1 chore: refactor `MultiModalConfig` initialization and profiling (#745) 3 months ago
  AlpinDale 28b6397188 chore: quant config for speculative draft models (#719) 3 months ago
  AlpinDale 008e646c7e chore: add support for up to 2048 block size (#715) 3 months ago
  AlpinDale 577586309d chore: multi-step args and sequence modifications (#713) 3 months ago
  AlpinDale 0b8b407b6d feat: support profiling with multiple multi-modal inputs per prompt (#712) 3 months ago
  AlpinDale d5033e12fd feat: implement mistral tokenizer mode (#711) 3 months ago
  AlpinDale 4fe371b7fa fix: allow passing float for GiB arguments (#690) 4 months ago
  AlpinDale bf88c8567e feat: mamba model support (#674) 4 months ago
  AlpinDale a0e446a17d feat: initial encoder-decoder support with BART model (#633) 4 months ago