Commit History

Author SHA1 Message Date
  AlpinDale c2d77b1822 chore: logging refactor (#302) 11 months ago
  AlpinDale 132d9927cb fix: speedup runtime update script 11 months ago
  Stefan Gligorijevic 7380c2c3ff chore: update gxx to 11.3 (#282) 11 months ago
  Aykut Akgün cbe37e8b18 fix: speed up cuda home detection (#288) 11 months ago
  AlpinDale a98babfb74 fix: bnb on Turing GPUs (#299) 11 months ago
  AlpinDale 49793d7c5a fix: bump bnb kernels to sm_80 due to async stream copies 11 months ago
  AlpinDale 9810daa699 feat: INT8 KV Cache (#298) 11 months ago
  AlpinDale 82955ba440 fix: backport bnb kernels (#297) 11 months ago
  Pyroserenus 951077de65 chore: update klite.embd with current version (#296) 11 months ago
  sgsdxzy 94c1543cae fix: typo in marlin kernel path (#295) 11 months ago
  AlpinDale e0c35bb353 feat: bitsandbytes and `--load-in{4,8}bit` support (#294) 11 months ago
  AlpinDale 705821a7fe feat: AQLM quantization support (#293) 11 months ago
  AlpinDale a1d8ab9f3e fix: lora on quantized models (barred gguf) (#292) 11 months ago
  AlpinDale 2d3d44b3e9 chore: add health check for ray workers (#290) 11 months ago
  AlpinDale ac82b67f75 feat: naive context shift and various QoL changes (#289) 11 months ago
  AlpinDale f35d15e632 fix: arg detection for kobold api launch (#286) 11 months ago
  AlpinDale 72229a94da feat: better marlin kernels (#285) 11 months ago
  AlpinDale 769b069e2e AttributeError fix in OpenAI server 11 months ago
  AlpinDale 23a7fd8cda remove ooba endpoint, fix and add deprecation warning for kobold endpoint, fix case where kobold endpoint was always launched with openai (#284) 11 months ago
  AlpinDale 13d850334e fix: navi support (#283) 11 months ago
  AlpinDale 9fa99215f8 feat: add cubic sampling (#280) 11 months ago
  AlpinDale 657aec0cbd refactor: OpenAI endpoint (#261) 11 months ago
  50h100a 3df36ee07d fix: logit bias logitproc (#278) 11 months ago
  AlpinDale 4d04ade9ef feat: fine-grained seeds (#279) 11 months ago
  Stefan Daniel Schwarz 810ca83066 fix+feat: docker compose (#264) 11 months ago
  AlpinDale 16615784b3 fix: prefix cache for turing gpus 11 months ago
  AlpinDale 7dc73a779a fix: properly perform garbage collection for lora (#277) 11 months ago
  AlpinDale 697c06c4f5 fix: LoRA support for mixtral (#276) 11 months ago
  AlpinDale 4b80b42362 fix: memory leaks due to nccl cuda graphs (#275) 11 months ago
  AlpinDale e31c6f0b45 feat: refactor modeling logic and support more models (#274) 11 months ago