Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| AlpinDale | 31092ad5ae | fix: issues template | 10 months ago |
| AlpinDale | e544814a92 | feat: add issue template and an env info collector (#321) | 10 months ago |
| AlpinDale | 89c32b40ec | chore: add new imatrix quants (#320) | 10 months ago |
| sgsdxzy | 50c0875c32 | chore: log total memory usage (#316) | 10 months ago |
| AlpinDale | e82b654ddd | readme: add tabby, fix docker, add colab (#315) | 10 months ago |
| AlpinDale | fa07e6db61 | docker: build docker for all CUDA arches | 10 months ago |
| drummerv | e59dd4a90d | fix: openai gguf chat template (#312) | 10 months ago |
| AlpinDale | b3df2351c8 | readme: update with bsz1 graph | 10 months ago |
| AlpinDale | 434dc19961 | CI: fix build failure for cuda versions with no torch wheels | 10 months ago |
| AlpinDale | 968bde81bf | fix: tensor parallel with GPTQ and AWQ quants (#307) | 10 months ago |
| AlpinDale | ff898c2c80 | bump version to 0.5.0 (#303) | 10 months ago |
| AlpinDale | c41462cfcd | feat: exllamav2 quantization (#305) | 10 months ago |
| AlpinDale | 3a045ebfde | fix: escape tags in loguru (#304) | 10 months ago |
| AlpinDale | 9ec611090d | chore: build for more cuda versions | 10 months ago |
| AlpinDale | c2d77b1822 | chore: logging refactor (#302) | 10 months ago |
| AlpinDale | 132d9927cb | fix: speedup runtime update script | 10 months ago |
| Stefan Gligorijevic | 7380c2c3ff | chore: update gxx to 11.3 (#282) | 10 months ago |
| Aykut Akgün | cbe37e8b18 | fix: speed up cuda home detection (#288) | 10 months ago |
| AlpinDale | a98babfb74 | fix: bnb on Turing GPUs (#299) | 10 months ago |
| AlpinDale | 49793d7c5a | fix: bump bnb kernels to sm_80 due to async stream copies | 10 months ago |
| AlpinDale | 9810daa699 | feat: INT8 KV Cache (#298) | 10 months ago |
| AlpinDale | 82955ba440 | fix: backport bnb kernels (#297) | 10 months ago |
| Pyroserenus | 951077de65 | chore: update klite.embd with current version (#296) | 10 months ago |
| sgsdxzy | 94c1543cae | fix: typo in marlin kernel path (#295) | 10 months ago |
| AlpinDale | e0c35bb353 | feat: bitsandbytes and `--load-in{4,8}bit` support (#294) | 10 months ago |
| AlpinDale | 705821a7fe | feat: AQLM quantization support (#293) | 10 months ago |
| AlpinDale | a1d8ab9f3e | fix: lora on quantized models (barred gguf) (#292) | 10 months ago |
| AlpinDale | 2d3d44b3e9 | chore: add health check for ray workers (#290) | 10 months ago |
| AlpinDale | ac82b67f75 | feat: naive context shift and various QoL changes (#289) | 10 months ago |
| AlpinDale | f35d15e632 | fix: arg detection for kobold api launch (#286) | 10 months ago |