Commit History

Author SHA1 Message Date
  AlpinDale b9b5e352cb typo 3 months ago
  AlpinDale 0e08cb1c12 add ultravox config file 3 months ago
  AlpinDale e81b8d1c52 move input parsing to utils 3 months ago
  AlpinDale 3693028340 feat: support for Audio modality (#698) 3 months ago
  AlpinDale 31483a7d3b fix: manually install triton for other devices to prevent outlines errors (#697) 3 months ago
  AlpinDale 5d37ec1016 suppress tpu import warning (#696) 3 months ago
  AlpinDale 0e558e9b2f fix: loading chameleon model with TP>1 (#695) 3 months ago
  AlpinDale 1d3a1fec47 feat: add load/unload endpoints for soft-prompts (#694) 3 months ago
  AlpinDale c34a6ac8e4 feat: add lora loading/unloading api endpoint (#693) 3 months ago
  AlpinDale 7debd35ca2 fix: shut down ray dag workers cleanly (#692) 3 months ago
  AlpinDale ec32f999bc build: bump cmake to 3.26 (#691) 3 months ago
  AlpinDale 4fe371b7fa fix: allow passing float for GiB arguments (#690) 4 months ago
  AlpinDale 6144150398 chore: use scalar type to dispatch to different `gptq_marlin` kernels (#689) 4 months ago
  AlpinDale 24456206a9 fix: logit softcapping in flash-attn (#688) 4 months ago
  AlpinDale f3bfdfb923 chore: use public ECR for neuron image (#687) 4 months ago
  AlpinDale 3f712cd287 feat: add progress bar for loading individual weight modules (#640) 4 months ago
  AlpinDale 2573b36f6a feat: allow image embeddings for VLM input (#686) 4 months ago
  AlpinDale 300f889554 chore: update flashinfer to v0.1.3 (#685) 4 months ago
  AlpinDale 4ca9aaaf3c build: add empty device (#684) 4 months ago
  AlpinDale b03fa02397 refactor: base worker input refactor for multi-step (#683) 4 months ago
  AlpinDale 8cfbe62a7c chore: bump lmfe to v0.10.6 and include triton for tpu and xpu dockerfiles (#682) 4 months ago
  AlpinDale 06cd48ea5c chore: use mark_dynamic to reduce TPU compile times (#681) 4 months ago
  AlpinDale fa5553b20f fix: phi3v batch inference with different aspect ratio images (#680) 4 months ago
  AlpinDale 79d603954e fix: chunked prefill with v2 block manager (#679) 4 months ago
  AlpinDale 3bbb3f2086 feat: add numpy implementation of `compute_slot_mapping` (#678) 4 months ago
  AlpinDale df208ab4e9 fix: fp8 checkpoints with fused linear modules (#677) 4 months ago
  AlpinDale 81fa31bcaf feat: embeddings support for batched OAI endpoint (#676) 4 months ago
  AlpinDale c2bb886b2e fix: reinit procedure in `ModelInputForGPUBuilder` (#675) 4 months ago
  AlpinDale bf88c8567e feat: mamba model support (#674) 4 months ago
  AlpinDale 8583aefed7 chore: mamba cache single buffer (#673) 4 months ago