Commit History

Author SHA1 Message Date
  AlpinDale a985143768 core: add cuda graph support for encoder-decoder models (#1051) 1 week ago
  AlpinDale 271879a4a5 fix: disable chunked prefill and prefix caching for multimodal models (#1037) 1 week ago
  AlpinDale ddaefd8d38 chore: remove engine_use_ray (#1024) 1 week ago
  AlpinDale fe01e2ded8 chore: move `device` keys to a constant (#1020) 2 weeks ago
  AlpinDale 9a42869055 chore: keep chunked prefill enabled with prefix caching (#1007) 2 weeks ago
  AlpinDale 145e554a4d neuron: add 8bit quantization for Neuron (#994) 2 weeks ago
  AlpinDale 510ae5b949 core: fix chunked prefill not being enabled by default for long contexts (#974) 2 weeks ago
  AlpinDale b3f6eeb1d2 vlm: increase the default `max_num_batched_tokens` for multimodal models (#973) 2 weeks ago
  AlpinDale 8d9f1fd4e6 feat: add single user mode (#927) 2 weeks ago
  AlpinDale f7f3fed265 feat: add async postprocessor (#925) 2 weeks ago
  AlpinDale 0c6d90dade neuron: add support for tensor parallelism (#923) 3 weeks ago
  AlpinDale 22a4cd4595 core: fix spec decode metrics and envs circular import (#889) 3 weeks ago
  AlpinDale 901900854e chore: consolidate environment variables within one file (#882) 4 weeks ago
  AlpinDale 48a8693aed feat: multi-step scheduling (#831) 1 month ago
  AlpinDale 2f61644f6e SPMD optimizations (#824) 1 month ago
  AlpinDale f088ea81c7 fix: --max-seq-len-to-capture arg (#818) 1 month ago
  AlpinDale 0256ed236b feat: windows support (#790) 2 months ago
  AlpinDale dcb794a340 fix: revert incorrect commit 2 months ago
  AlpinDale 76367b5ae7 wip 2 months ago
  AlpinDale 7222b84582 feat: ministral support (#776) 2 months ago
  AlpinDale 73177656ed feat: quant_llm support (#755) 3 months ago
  AlpinDale 89a2c6dee1 chore: refactor `MultiModalConfig` initialization and profiling (#745) 3 months ago
  AlpinDale 28b6397188 chore: quant config for speculative draft models (#719) 3 months ago
  AlpinDale 008e646c7e chore: add support for up to 2048 block size (#715) 3 months ago
  AlpinDale 577586309d chore: multi-step args and sequence modifications (#713) 3 months ago
  AlpinDale 0b8b407b6d feat: support profiling with multiple multi-modal inputs per prompt (#712) 3 months ago
  AlpinDale d5033e12fd feat: implement mistral tokenizer mode (#711) 3 months ago
  AlpinDale 4fe371b7fa fix: allow passing float for GiB arguments (#690) 4 months ago
  AlpinDale bf88c8567e feat: mamba model support (#674) 4 months ago
  AlpinDale a0e446a17d feat: initial encoder-decoder support with BART model (#633) 4 months ago