Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| AlpinDale | 82955ba440 | fix: backport bnb kernels (#297) | 1 year ago |
| Pyroserenus | 951077de65 | chore: update klite.embd with current version (#296) | 1 year ago |
| sgsdxzy | 94c1543cae | fix: typo in marlin kernel path (#295) | 1 year ago |
| AlpinDale | e0c35bb353 | feat: bitsandbytes and `--load-in{4,8}bit` support (#294) | 1 year ago |
| AlpinDale | 705821a7fe | feat: AQLM quantization support (#293) | 1 year ago |
| AlpinDale | a1d8ab9f3e | fix: lora on quantized models (barred gguf) (#292) | 1 year ago |
| AlpinDale | 2d3d44b3e9 | chore: add health check for ray workers (#290) | 1 year ago |
| AlpinDale | ac82b67f75 | feat: naive context shift and various QoL changes (#289) | 1 year ago |
| AlpinDale | f35d15e632 | fix: arg detection for kobold api launch (#286) | 1 year ago |
| AlpinDale | 72229a94da | feat: better marlin kernels (#285) | 1 year ago |
| AlpinDale | 769b069e2e | AttributeError fix in OpenAI server | 1 year ago |
| AlpinDale | 23a7fd8cda | remove ooba endpoint, fix and add deprecation warning for kobold endpoint, fix case where kobold endpoint was always launched with openai (#284) | 1 year ago |
| AlpinDale | 13d850334e | fix: navi support (#283) | 1 year ago |
| AlpinDale | 9fa99215f8 | feat: add cubic sampling (#280) | 1 year ago |
| AlpinDale | 657aec0cbd | refactor: OpenAI endpoint (#261) | 1 year ago |
| 50h100a | 3df36ee07d | fix: logit bias logitproc (#278) | 1 year ago |
| AlpinDale | 4d04ade9ef | feat: fine-grained seeds (#279) | 1 year ago |
| Stefan Daniel Schwarz | 810ca83066 | fix+feat: docker compose (#264) | 1 year ago |
| AlpinDale | 16615784b3 | fix: prefix cache for turing gpus | 1 year ago |
| AlpinDale | 7dc73a779a | fix: properly perform garbage collection for lora (#277) | 1 year ago |
| AlpinDale | 697c06c4f5 | fix: LoRA support for mixtral (#276) | 1 year ago |
| AlpinDale | 4b80b42362 | fix: memory leaks due to nccl cuda graphs (#275) | 1 year ago |
| AlpinDale | e31c6f0b45 | feat: refactor modeling logic and support more models (#274) | 1 year ago |
| AlpinDale | 7d6ba53602 | feat: fused top-k kernels for MoE (#273) | 1 year ago |
| AlpinDale | a3cab09b69 | chore: logging env variable | 1 year ago |
| AlpinDale | 2c08aa5af4 | chore: remove eos token from output (#272) | 1 year ago |
| AlpinDale | 8e1cd54497 | fix: do not include fp8 for rocm (#271) | 1 year ago |
| AlpinDale | 6a63ab4ec3 | fix: remote encode request if using ray (#270) | 1 year ago |
| AlpinDale | 224b87b484 | feat: add fused mixtral moe support (#238) | 1 year ago |
| Thomas Xin | 43cf0e98a0 | fix: worker initialization on WSL (#260) | 1 year ago |