AlpinDale | e53842bd5d fix: cuda home detection for fp8 kv cache | 9 months ago
AlpinDale | 7411a74cc6 bump version to 0.5.2 | 9 months ago
AlpinDale | ad6802690f feat: CMake Build System Generator (#332) | 9 months ago
AlpinDale | da223153c6 feat&fix: cohere support and missing GPU blocks (#333) | 9 months ago
AlpinDale | e2a7b50440 fix: logprobs when inf or nan (#329) | 9 months ago
AlpinDale | 4791a63fdc fix: env.py url in bugs template | 9 months ago
AlpinDale | 8071ead964 chore: allow docker port and host to be changed (#327) | 10 months ago
AlpinDale | 594fe814dc bump version to v0.5.1 (#326) | 10 months ago
AlpinDale | f8652c8e99 fix: optimize aqlm dequantization (#325) | 10 months ago
AlpinDale | e42a78381a feat: switch from pylint to ruff (#322) | 10 months ago
AlpinDale | 637649df99 fix: model -> model architecture in issue templates | 10 months ago
AlpinDale | 31092ad5ae fix: issues template | 10 months ago
AlpinDale | e544814a92 feat: add issue template and an env info collector (#321) | 10 months ago
AlpinDale | 89c32b40ec chore: add new imatrix quants (#320) | 10 months ago
sgsdxzy | 50c0875c32 chore: log total memory usage (#316) | 10 months ago
AlpinDale | e82b654ddd readme: add tabby, fix docker, add colab (#315) | 10 months ago
AlpinDale | fa07e6db61 docker: build docker for all CUDA arches | 10 months ago
drummerv | e59dd4a90d fix: openai gguf chat template (#312) | 10 months ago
AlpinDale | b3df2351c8 readme: update with bsz1 graph | 10 months ago
AlpinDale | 434dc19961 CI: fix build failure for cuda versions with no torch wheels | 10 months ago
AlpinDale | 968bde81bf fix: tensor parallel with GPTQ and AWQ quants (#307) | 10 months ago
AlpinDale | ff898c2c80 bump version to 0.5.0 (#303) | 10 months ago
AlpinDale | c41462cfcd feat: exllamav2 quantization (#305) | 10 months ago
AlpinDale | 3a045ebfde fix: escape tags in loguru (#304) | 10 months ago
AlpinDale | 9ec611090d chore: build for more cuda versions | 10 months ago
AlpinDale | c2d77b1822 chore: logging refactor (#302) | 10 months ago
AlpinDale | 132d9927cb fix: speedup runtime update script | 10 months ago
Stefan Gligorijevic | 7380c2c3ff chore: update gxx to 11.3 (#282) | 10 months ago
Aykut Akgün | cbe37e8b18 fix: speed up cuda home detection (#288) | 10 months ago
AlpinDale | a98babfb74 fix: bnb on Turing GPUs (#299) | 10 months ago