AlpinDale | 81e7981dce | feat: add prometheus production metrics (#154) | 1 year ago
AlpinDale | 62b2c4119d | feat: re-write GPTQ and refactor exllama kernels (#152) | 1 year ago
AlpinDale | 8ed7d56305 | feat: OpenAI chat completions templates (#138) | 1 year ago
AlpinDale | 653da510d1 | chore: rewrite InputMetadata (#143) | 1 year ago
AlpinDale | 5dbd5f8c30 | fix: quant TP (#129) | 1 year ago
AlpinDale | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago
AlpinDale | 63c28919a0 | Revert "fix: correct auto ntk scaling_factor for 4k ctx case" (#149) | 1 year ago
g4rg | 2c5b0268a4 | chore: KoboldAI/koboldcpp updates (#48) | 1 year ago
AlpinDale | e386032ae8 | fix: rope duplication (#142) | 1 year ago
AlpinDale | 2b1ba581f9 | feat: re-implement GPTQ (#141) | 1 year ago
AlpinDale | 8223f85c1b | feat: SqueezeLLM support (#140) | 1 year ago
AlpinDale | 9d4e437df9 | fix: make llama2 the default sep style (#137) | 1 year ago
AlpinDale | 05298f1120 | properly disable log requests | 1 year ago
AlpinDale | 8b2bbbd98b | chore: attention rewrite + models (#135) | 1 year ago
AlpinDale | c9bdb3d57a | fix: blocktable definition (#134) | 1 year ago
AlpinDale | 6c914ea0e4 | fix: `SequenceOutputs` -> `SequenceOutput` (#133) | 1 year ago
AlpinDale | d4ff350cdb | add deprecation warning for ooba API | 1 year ago
AlpinDale | fe9637efef | chore: initialize model on GPU (#132) | 1 year ago
AlpinDale | 1aab8a7d6f | feat: speedup compilation times by 3x (#130) | 1 year ago
AlpinDale | 237d2ec28d | fix: CPU OOM for large models (#128) | 1 year ago
AlpinDale | 9ec4e08ade | fix: cpu sync delay fix (#127) | 1 year ago
AlpinDale | 13901af940 | fix: scheduler hang with long prompts (#126) | 1 year ago
AlpinDale | 7612f33afd | feat: fused add RMSNorm kernels (#125) | 1 year ago
AlpinDale | 0d51eac374 | feat: awq for all models (#124) | 1 year ago
AlpinDale | fd18a1d956 | fix: get_tensor instead of pysafeslice | 1 year ago
AlpinDale | 5ea6889cea | chore: read from quantization_config (#123) | 1 year ago
AlpinDale | 3459f1c185 | feat: usage stats for OpenAI endpoint (#122) | 1 year ago
AlpinDale | dec1133812 | feat: phi 1.5 support (#121) | 1 year ago
AlpinDale | 7c1e00f51b | fix: GH actions for dev branch | 1 year ago
AlpinDale | f49cb1ffe1 | fix: duplication in engine step (#120) | 1 year ago