Commit History

Author SHA1 Message Date
  AlpinDale 82955ba440 fix: backport bnb kernels (#297) 1 year ago
  Pyroserenus 951077de65 chore: update klite.embd with current version (#296) 1 year ago
  sgsdxzy 94c1543cae fix: typo in marlin kernel path (#295) 1 year ago
  AlpinDale e0c35bb353 feat: bitsandbytes and `--load-in{4,8}bit` support (#294) 1 year ago
  AlpinDale 705821a7fe feat: AQLM quantization support (#293) 1 year ago
  AlpinDale a1d8ab9f3e fix: lora on quantized models (barred gguf) (#292) 1 year ago
  AlpinDale 2d3d44b3e9 chore: add health check for ray workers (#290) 1 year ago
  AlpinDale ac82b67f75 feat: naive context shift and various QoL changes (#289) 1 year ago
  AlpinDale f35d15e632 fix: arg detection for kobold api launch (#286) 1 year ago
  AlpinDale 72229a94da feat: better marlin kernels (#285) 1 year ago
  AlpinDale 769b069e2e AttributeError fix in OpenAI server 1 year ago
  AlpinDale 23a7fd8cda remove ooba endpoint, fix and add deprecation warning for kobold endpoint, fix case where kobold endpoint was always launched with openai (#284) 1 year ago
  AlpinDale 13d850334e fix: navi support (#283) 1 year ago
  AlpinDale 9fa99215f8 feat: add cubic sampling (#280) 1 year ago
  AlpinDale 657aec0cbd refactor: OpenAI endpoint (#261) 1 year ago
  50h100a 3df36ee07d fix: logit bias logitproc (#278) 1 year ago
  AlpinDale 4d04ade9ef feat: fine-grained seeds (#279) 1 year ago
  Stefan Daniel Schwarz 810ca83066 fix+feat: docker compose (#264) 1 year ago
  AlpinDale 16615784b3 fix: prefix cache for turing gpus 1 year ago
  AlpinDale 7dc73a779a fix: properly perform garbage collection for lora (#277) 1 year ago
  AlpinDale 697c06c4f5 fix: LoRA support for mixtral (#276) 1 year ago
  AlpinDale 4b80b42362 fix: memory leaks due to nccl cuda graphs (#275) 1 year ago
  AlpinDale e31c6f0b45 feat: refactor modeling logic and support more models (#274) 1 year ago
  AlpinDale 7d6ba53602 feat: fused top-k kernels for MoE (#273) 1 year ago
  AlpinDale a3cab09b69 chore: logging env variable 1 year ago
  AlpinDale 2c08aa5af4 chore: remove eos token from output (#272) 1 year ago
  AlpinDale 8e1cd54497 fix: do not include fp8 for rocm (#271) 1 year ago
  AlpinDale 6a63ab4ec3 fix: remote encode request if using ray (#270) 1 year ago
  AlpinDale 224b87b484 feat: add fused mixtral moe support (#238) 1 year ago
  Thomas Xin 43cf0e98a0 fix: worker initialization on WSL (#260) 1 year ago