Commit history

Author SHA1 Message Date
  AlpinDale fc5ef786b0 Merge branch 'main' into lm_head_lora 2 weeks ago
  AlpinDale c90abcc603 VLM: add pipeline parallelism support for Qwen2-VL (#1103) 2 weeks ago
  AlpinDale cc5e185795 VLM: support passing multimodal processor kwargs (#1102) 2 weeks ago
  AlpinDale 8e7d214d2d Merge branch 'main' into lm_head_lora 1 month ago
  AlpinDale 92cee435e2 rocm: add more quants, fix _scaled_mm call (#1062) 1 month ago
  AlpinDale b3f9ab3b72 quant: add tensor parallel support for bitsandbytes (#1052) 1 month ago
  AlpinDale a985143768 core: add cuda graph support for encoder-decoder models (#1051) 1 month ago
  AlpinDale 4b1b658855 tpu: implement multi-step scheduling (#1046) 1 month ago
  AlpinDale ddaefd8d38 chore: remove engine_use_ray (#1024) 1 month ago
  AlpinDale f2b6dc3872 cpu: add support for W8A8 quantization via compressed-tensor (#1017) 1 month ago
  AlpinDale 411ac4f405 vlm: add support for Qwen2-VL model (#1015) 1 month ago
  AlpinDale dcb36de9c4 quants: add support for NVIDIA's ModelOpt checkpoints (#1013) 1 month ago
  AlpinDale 30d02d0747 chore: remove peft as a requirement (#1006) 1 month ago
  AlpinDale 145e554a4d neuron: add 8bit quantization for Neuron (#994) 1 month ago
  AlpinDale b3f6eeb1d2 vlm: increase the default `max_num_batched_tokens` for multimodal models (#973) 1 month ago
  AlpinDale 5bec8fbb1b tpu: add support for async postprocessing (#968) 1 month ago
  AlpinDale a8bdd488b9 distributed: support pipeline parallelism for internvl and internlm2 (#965) 1 month ago
  AlpinDale fcfcfc65e1 quants: add triton kernels for AWQ (#946) 1 month ago
  AlpinDale 8d9f1fd4e6 feat: add single user mode (#927) 1 month ago
  AlpinDale f7f3fed265 feat: add async postprocessor (#925) 1 month ago
  AlpinDale 132aa2abe4 spec decode: add support for EAGLE (#899) 1 month ago
  AlpinDale 908ff753a1 fix: phi_3.5_v loading (#896) 1 month ago
  AlpinDale 22a4cd4595 core: fix spec decode metrics and envs circular import (#889) 1 month ago
  AlpinDale 80e7f3b0bd feat: support finetuned lm_head and embed_tokens in LoRA adapters 1 month ago
  AlpinDale 901900854e chore: consolidate environment variables within one file (#882) 2 months ago
  AlpinDale 9288a98084 spec decoding: set the draft model ctxlen to target model (#874) 2 months ago
  AlpinDale 483c9e6e59 fix: disable awq_marlin override for awq models (#843) 2 months ago
  AlpinDale 2f61644f6e SPMD optimizations (#824) 2 months ago
  AlpinDale 0a369f9171 feat: support chunked prefill with LoRA (#823) 2 months ago
  AlpinDale c6c91edab7 ci: update & overhaul test units (#769) 2 months ago