Author | Commit | Message | Age
AlpinDale | 31092ad5ae | fix: issues template | 10 months ago
AlpinDale | e544814a92 | feat: add issue template and an env info collector (#321) | 10 months ago
AlpinDale | 89c32b40ec | chore: add new imatrix quants (#320) | 10 months ago
sgsdxzy | 50c0875c32 | chore: log total memory usage (#316) | 10 months ago
AlpinDale | e82b654ddd | readme: add tabby, fix docker, add colab (#315) | 10 months ago
AlpinDale | fa07e6db61 | docker: build docker for all CUDA arches | 10 months ago
drummerv | e59dd4a90d | fix: openai gguf chat template (#312) | 10 months ago
AlpinDale | b3df2351c8 | readme: update with bsz1 graph | 10 months ago
AlpinDale | 434dc19961 | CI: fix build failure for cuda versions with no torch wheels | 10 months ago
AlpinDale | 968bde81bf | fix: tensor parallel with GPTQ and AWQ quants (#307) | 10 months ago
AlpinDale | ff898c2c80 | bump version to 0.5.0 (#303) | 10 months ago
AlpinDale | c41462cfcd | feat: exllamav2 quantization (#305) | 10 months ago
AlpinDale | 3a045ebfde | fix: escape tags in loguru (#304) | 10 months ago
AlpinDale | 9ec611090d | chore: build for more cuda versions | 10 months ago
AlpinDale | c2d77b1822 | chore: logging refactor (#302) | 10 months ago
AlpinDale | 132d9927cb | fix: speedup runtime update script | 10 months ago
Stefan Gligorijevic | 7380c2c3ff | chore: update gxx to 11.3 (#282) | 10 months ago
Aykut Akgün | cbe37e8b18 | fix: speed up cuda home detection (#288) | 10 months ago
AlpinDale | a98babfb74 | fix: bnb on Turing GPUs (#299) | 10 months ago
AlpinDale | 49793d7c5a | fix: bump bnb kernels to sm_80 due to async stream copies | 10 months ago
AlpinDale | 9810daa699 | feat: INT8 KV Cache (#298) | 10 months ago
AlpinDale | 82955ba440 | fix: backport bnb kernels (#297) | 10 months ago
Pyroserenus | 951077de65 | chore: update klite.embd with current version (#296) | 10 months ago
sgsdxzy | 94c1543cae | fix: typo in marlin kernel path (#295) | 10 months ago
AlpinDale | e0c35bb353 | feat: bitsandbytes and `--load-in{4,8}bit` support (#294) | 10 months ago
AlpinDale | 705821a7fe | feat: AQLM quantization support (#293) | 10 months ago
AlpinDale | a1d8ab9f3e | fix: lora on quantized models (barred gguf) (#292) | 10 months ago
AlpinDale | 2d3d44b3e9 | chore: add health check for ray workers (#290) | 10 months ago
AlpinDale | ac82b67f75 | feat: naive context shift and various QoL changes (#289) | 10 months ago
AlpinDale | f35d15e632 | fix: arg detection for kobold api launch (#286) | 10 months ago