Commit History

Author SHA1 Message Date
  AlpinDale 8da2be03ce feat: bump version to v0.4.7 (#248) 1 year ago
  AlpinDale ea0f57b233 feat: allow further support for non-cuda devices (#247) 1 year ago
  AlpinDale 4faf78ba29 fix: grab correct quant config from revisions (#246) 1 year ago
  AlpinDale 7760913873 fix: garbage output from GPTQ (#245) 1 year ago
  50h100a f619c96c79 fix: zero token output due to temperature bias (#243) 1 year ago
  50h100a 53a9c60442 fix: logit processor declarations and application (#242) 1 year ago
  50h100a 2e3318c1fa yapf considers this space to be CRITICAL 1 year ago
  AlpinDale 9ed45fec7c fix: incorrect prometheus url 1 year ago
  50h100a 25acebe33d better variable naming 1 year ago
  AlpinDale d2db4143fa feat: add grafana for metrics (#240) 1 year ago
  AlpinDale 1a94ccf3cf fix: prefix cache fail with lora (#239) 1 year ago
  50h100a 7b3bb995c1 topk as linear write 1 year ago
  AlpinDale 85c92acfb3 fix: do not initialize all-reduce at world_size=1 1 year ago
  AlpinDale d9b65e6c5f feat: DeepSeek MoE support (#237) 1 year ago
  AlpinDale e73a92ad2f fix: remove the mask for quadratic sampling (#236) 1 year ago
  AlpinDale aebd68c632 feat: backport kernels (#235) 1 year ago
  AlpinDale bb158b6282 fix: bump torch to 2.2.0 (#234) 1 year ago
  AlpinDale 1c46fa31ad feat: add quadratic sampling (#233) 1 year ago
  AlpinDale f0dacc17dd fix: remove fast-hadamard-transform in requirements 1 year ago
  AlpinDale 5d288aa76c feat: add fast hadamard transformation kernels (#232) 1 year ago
  AlpinDale 12fb635f70 readme: add docker 1 year ago
  AlpinDale eb8698c7bd readme: update with new benchmarks 1 year ago
  AlpinDale 59df05f341 feat: add `/metrics` route for kobold (#229) 1 year ago
  AlpinDale c3a221eb02 feat: GGUF, QuIP#, and Marlin support (#228) 1 year ago
  AlpinDale 6305e6f3f2 fix: no repeated IPC registration (#227) 1 year ago
  AlpinDale 0adab894fe feat: grammar support (#206) 1 year ago
  AlpinDale 31c95011a6 feat: FP8 E5M2 KV Cache (#226) 1 year ago
  AlpinDale c0146ed00e chore: slight refactor for async engine finish (#225) 1 year ago
  AlpinDale 339c6aec53 chore: bump ray version 1 year ago
  AlpinDale 641bb0f6e9 feat: add custom allreduce kernels (#224) 1 year ago