Author | Commit | Message | Date
AlpinDale | 2b5af25923 | add perplexity example | 10 months ago
drummerv | e59dd4a90d | fix: openai gguf chat template (#312) | 10 months ago
AlpinDale | b3df2351c8 | readme: update with bsz1 graph | 10 months ago
AlpinDale | 434dc19961 | CI: fix build failure for cuda versions with no torch wheels | 10 months ago
AlpinDale | 968bde81bf | fix: tensor parallel with GPTQ and AWQ quants (#307) | 10 months ago
AlpinDale | ff898c2c80 | bump version to 0.5.0 (#303) | 10 months ago
AlpinDale | c41462cfcd | feat: exllamav2 quantization (#305) | 10 months ago
AlpinDale | 3a045ebfde | fix: escape tags in loguru (#304) | 10 months ago
AlpinDale | 9ec611090d | chore: build for more cuda versions | 10 months ago
AlpinDale | c2d77b1822 | chore: logging refactor (#302) | 10 months ago
AlpinDale | 132d9927cb | fix: speedup runtime update script | 10 months ago
Stefan Gligorijevic | 7380c2c3ff | chore: update gxx to 11.3 (#282) | 10 months ago
Aykut Akgün | cbe37e8b18 | fix: speed up cuda home detection (#288) | 10 months ago
AlpinDale | a98babfb74 | fix: bnb on Turing GPUs (#299) | 10 months ago
AlpinDale | 49793d7c5a | fix: bump bnb kernels to sm_80 due to async stream copies | 10 months ago
AlpinDale | 9810daa699 | feat: INT8 KV Cache (#298) | 10 months ago
AlpinDale | 82955ba440 | fix: backport bnb kernels (#297) | 10 months ago
Pyroserenus | 951077de65 | chore: update klite.embd with current version (#296) | 10 months ago
sgsdxzy | 94c1543cae | fix: typo in marlin kernel path (#295) | 10 months ago
AlpinDale | e0c35bb353 | feat: bitsandbytes and `--load-in{4,8}bit` support (#294) | 10 months ago
AlpinDale | 705821a7fe | feat: AQLM quantization support (#293) | 10 months ago
AlpinDale | a1d8ab9f3e | fix: lora on quantized models (barred gguf) (#292) | 10 months ago
AlpinDale | 2d3d44b3e9 | chore: add health check for ray workers (#290) | 10 months ago
AlpinDale | ac82b67f75 | feat: naive context shift and various QoL changes (#289) | 10 months ago
AlpinDale | f35d15e632 | fix: arg detection for kobold api launch (#286) | 10 months ago
AlpinDale | 72229a94da | feat: better marlin kernels (#285) | 10 months ago
AlpinDale | 769b069e2e | AttributeError fix in OpenAI server | 10 months ago
AlpinDale | 23a7fd8cda | remove ooba endpoint, fix and add deprecation warning for kobold endpoint, fix case where kobold endpoint was always launched with openai (#284) | 10 months ago
AlpinDale | 13d850334e | fix: navi support (#283) | 10 months ago
AlpinDale | 9fa99215f8 | feat: add cubic sampling (#280) | 10 months ago