AlpinDale | feb5840f2a | feat: async tokenization (#374) | 9 months ago
IggoOnCode | 2aec297c55 | feat: add embeddings endpoint to openai rest-api server. (#363) | 9 months ago
AlpinDale | 29c241c115 | fix: explicitly disallow installation on non-linux platforms (#373) | 9 months ago
AlpinDale | 439a826712 | fix: broadcast group | 9 months ago
AlpinDale | 935027bdcc | feat: dynamic shared memory allocation for moe align block size (#372) | 9 months ago
AlpinDale | 97a2b26c97 | fix: assertion error when use_sliding_window is present | 9 months ago
AlpinDale | e702f587cf | feat: add batched RoPE kernels (#371) | 9 months ago
AlpinDale | 3d6695cfbb | feat: add approximate gelu activation kernels (#370) | 9 months ago
AlpinDale | 5fa15b4435 | fix: double free with sliding window (#369) | 9 months ago
AlpinDale | 72cd8494aa | feat: mistral neuron support (#368) | 9 months ago
AlpinDale | 0f6d56b07f | feat: model executor refactor (#367) | 9 months ago
AlpinDale | b361096463 | fix: tokenizer when using ray (#366) | 9 months ago
AlpinDale | f8dfac6372 | chore: attention refactor and upstream sync apr01 (#365) | 9 months ago
50h100a | 35b4aa7da5 | Fix logitproc for logit_bias in OAI endpoints. | 9 months ago
50h100a | 7ed57e318d | Overhauled SamplingTensors construction. | 9 months ago
50h100a | a39920bc99 | Merge pull request #355 from 50h100a/pr_seedfix | 9 months ago
50h100a | 051c60736e | Merge pull request #356 from 50h100a/pr_samplerinternals | 9 months ago
50h100a | d5dbd29db4 | hoist sampler internals into a single function. | 9 months ago
50h100a | b9e0ae87c5 | fix fine-grained seeding. | 9 months ago
sgsdxzy | 6ebac34dc1 | chore: cleaner pre-llamafied Yi implementation (#352) | 9 months ago
AlpinDale | 681e94611f | fix: restore backwards compatibility with old Yi models (#351) | 9 months ago
AlpinDale | 1b6732fcde | chore: bump transformers version | 9 months ago
Absurd | 070c1cef8c | fix: explicit RFC3986 for prometheus_client asgi (#344) | 9 months ago
Stefan Daniel Schwarz | 5d747cfc4d | readme: docker docs (#340) | 9 months ago
Stefan Daniel Schwarz | 8e259ee7cf | chore: hf_transfer for faster downloads (#339) | 9 months ago
AlpinDale | 398a97338a | feat: enable lora loading/unloading via API (#337) | 9 months ago
Stefan Daniel Schwarz | b0688b6b9c | fix: docker port and kobold api (#338) | 9 months ago
AlpinDale | ed225f59cb | fix: transformers in requirements | 9 months ago
AlpinDale | e120404436 | Revert "feat: CMake Build System Generator (#332)" | 9 months ago
AlpinDale | 06312251a7 | fix: explictly export CUDA arches for CI | 9 months ago