Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| AlpinDale | 81e7981dce | feat: add prometheus production metrics (#154) | 1 year ago |
| AlpinDale | 62b2c4119d | feat: re-write GPTQ and refactor exllama kernels (#152) | 1 year ago |
| AlpinDale | 8ed7d56305 | feat: OpenAI chat completions templates (#138) | 1 year ago |
| AlpinDale | 653da510d1 | chore: rewrite InputMetadata (#143) | 1 year ago |
| AlpinDale | 5dbd5f8c30 | fix: quant TP (#129) | 1 year ago |
| AlpinDale | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago |
| AlpinDale | 63c28919a0 | Revert "fix: correct auto ntk scaling_factor for 4k ctx case" (#149) | 1 year ago |
| g4rg | 2c5b0268a4 | chore: KoboldAI/koboldcpp updates (#48) | 1 year ago |
| AlpinDale | e386032ae8 | fix: rope duplication (#142) | 1 year ago |
| AlpinDale | 2b1ba581f9 | feat: re-implement GPTQ (#141) | 1 year ago |
| AlpinDale | 8223f85c1b | feat: SqueezeLLM support (#140) | 1 year ago |
| AlpinDale | 9d4e437df9 | fix: make llama2 the default sep style (#137) | 1 year ago |
| AlpinDale | 05298f1120 | properly disable log requests | 1 year ago |
| AlpinDale | 8b2bbbd98b | chore: attention rewrite + models (#135) | 1 year ago |
| AlpinDale | c9bdb3d57a | fix: blocktable definition (#134) | 1 year ago |
| AlpinDale | 6c914ea0e4 | fix: `SequenceOutputs` -> `SequenceOutput` (#133) | 1 year ago |
| AlpinDale | d4ff350cdb | add deprecation warning for ooba API | 1 year ago |
| AlpinDale | fe9637efef | chore: initialize model on GPU (#132) | 1 year ago |
| AlpinDale | 1aab8a7d6f | feat: speedup compilation times by 3x (#130) | 1 year ago |
| AlpinDale | 237d2ec28d | fix: CPU OOM for large models (#128) | 1 year ago |
| AlpinDale | 9ec4e08ade | fix: cpu sync delay fix (#127) | 1 year ago |
| AlpinDale | 13901af940 | fix: scheduler hang with long prompts (#126) | 1 year ago |
| AlpinDale | 7612f33afd | feat: fused add RMSNorm kernels (#125) | 1 year ago |
| AlpinDale | 0d51eac374 | feat: awq for all models (#124) | 1 year ago |
| AlpinDale | fd18a1d956 | fix: get_tensor instead of pysafeslice | 1 year ago |
| AlpinDale | 5ea6889cea | chore: read from quantization_config (#123) | 1 year ago |
| AlpinDale | 3459f1c185 | feat: usage stats for OpenAI endpoint (#122) | 1 year ago |
| AlpinDale | dec1133812 | feat: phi 1.5 support (#121) | 1 year ago |
| AlpinDale | 7c1e00f51b | fix: GH actions for dev branch | 1 year ago |
| AlpinDale | f49cb1ffe1 | fix: duplication in engine step (#120) | 1 year ago |