sgsdxzy | fe7844f2ef | feat: sharding and safetensors support for gguf conversion (#256) | 11 months ago
Stefan Gligorijevic | c57c774947 | Accept Kobold sampler IDs engine wide | 11 months ago
Stefan Gligorijevic | 5e3affe0ff | what was this? | 11 months ago
Stefan Gligorijevic | 3fc7490cb5 | add quad to sampler order | 11 months ago
Stefan Gligorijevic | 24e59a1c04 | Merge remote-tracking branch 'upstream/main' into sampler_order_v2 | 11 months ago
AlpinDale | 8635901c76 | fix: s-lora vocab embeddings | 11 months ago
AlpinDale | c76b611021 | docker: update the Dockerfile and push the latest image (#254) | 11 months ago
anon998 | 35b9033782 | fix: crash in quadratic sampling when batch > 1 (#253) | 11 months ago
AlpinDale | 84f535dcb5 | Merge branch 'main' into sampler_order_v2 | 11 months ago
AlpinDale | 842912d022 | feat: on-the-fly gguf conversion (#250) | 11 months ago
AlpinDale | faca8745d6 | fix: linting issue (#249) | 11 months ago
AlpinDale | 3163839c88 | bump version to 0.4.9 | 11 months ago
AlpinDale | f99eb2c874 | fix: hadamard tensors not included in wheel | 11 months ago
AlpinDale | 8b6790d504 | fix: gguf config not recognized | 11 months ago
AlpinDale | a1836a40e2 | bump version to v0.4.8 | 11 months ago
AlpinDale | 2bd6c92f73 | fix: lora inclusion in wheels | 11 months ago
AlpinDale | 8da2be03ce | feat: bump version to v0.4.7 (#248) | 11 months ago
AlpinDale | ea0f57b233 | feat: allow further support for non-cuda devices (#247) | 11 months ago
AlpinDale | 4faf78ba29 | fix: grab correct quant config from revisions (#246) | 11 months ago
AlpinDale | 7760913873 | fix: garbage output from GPTQ (#245) | 11 months ago
50h100a | f619c96c79 | fix: zero token output due to temperature bias (#243) | 11 months ago
50h100a | 53a9c60442 | fix: logit processor declarations and application (#242) | 11 months ago
AlpinDale | 9ed45fec7c | fix: incorrect prometheus url | 11 months ago
AlpinDale | d2db4143fa | feat: add grafana for metrics (#240) | 11 months ago
AlpinDale | 1a94ccf3cf | fix: prefix cache fail with lora (#239) | 11 months ago
AlpinDale | 85c92acfb3 | fix: do not initialize all-reduce at world_size=1 | 11 months ago
AlpinDale | d9b65e6c5f | feat: DeepSeek MoE support (#237) | 11 months ago
AlpinDale | e73a92ad2f | fix: remove the mask for quadratic sampling (#236) | 11 months ago
AlpinDale | aebd68c632 | feat: backport kernels (#235) | 11 months ago
AlpinDale | bb158b6282 | fix: bump torch to 2.2.0 (#234) | 11 months ago