Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| AlpinDale | 4e71bd1d12 | feat: add PagedAttention V2 kernels (#76) | 1 year ago |
| 50h100a | d0eadd4dbd | Added `min_tokens` and reimplemented `ignore_eos` using a new logit processor (#70) | 1 year ago |
| AlpinDale | 04a27c6aeb | fix: revert mirostat v2 (#79) | 1 year ago |
| AlpinDale | 9c353a0e02 | fix: unnecessary import | 1 year ago |
| AlpinDale | ce5e2332ea | fix: launch AWQ kernels on the current CUDAStream (#75) | 1 year ago |
| AlpinDale | 561773dec8 | fix: hopefully fixes github actions | 1 year ago |
| AlpinDale | 2c1d6a8cf2 | fix: fast tokenizer latency reduction | 1 year ago |
| AlpinDale | 28db67fd78 | fix: mistral support | 1 year ago |
| AlpinDale | d8f04f29a9 | readme: update sampling params | 1 year ago |
| Stefan Gligorijevic | 5acc27adeb | chore: fix parameter validation on ooba endpoint | 1 year ago |
| Stefan Gligorijevic | 93daff0384 | chore: delete leftover debug prints | 1 year ago |
| Stefan Gligorijevic | 5dbd262033 | feat: Mirostat v2 (#69) | 1 year ago |
| AlpinDale | f393dc2af1 | fix: broken GPTQ layer | 1 year ago |
| AlpinDale | 3bf6197afb | fix: prompt processing delay introduced by #66 (#71) | 1 year ago |
| AlpinDale | 9df91fe863 | bump version to 0.3.6 | 1 year ago |
| AlpinDale | 2b42a1ada2 | bump the version to 0.3.5 | 1 year ago |
| AlpinDale | c55c8f7bd8 | update readme | 1 year ago |
| AlpinDale | 380206038e | fix: change the timing of logit sorting (#66) | 1 year ago |
| AlpinDale | bdad759503 | feat: YaRN context window extension support (#67) | 1 year ago |
| AlpinDale | f04588203e | feat: mistral AWQ support and file blacklisting | 1 year ago |
| AlpinDale | 7572e1dd59 | overflow in AWQ GEMM kernel | 1 year ago |
| AlpinDale | c1fa7e8567 | chore: fix datatype check (#65) | 1 year ago |
| AlpinDale | a6a4220fa6 | feat: refactor megatron and quants (#57) | 1 year ago |
| AlpinDale | 9a9e59b871 | update readme with new instructions | 1 year ago |
| g4rg | 16bf6b61a3 | fix: requests stalling in KAI non-streaming endpoint (#46) | 1 year ago |
| LitreallyNone | b526a7b3bc | Update requirements.txt (#58) | 1 year ago |
| AlpinDale | 2e70a6d5ed | chore: allow the user to specify install method (#56) | 1 year ago |
| official-elinas | 46e472062a | chore: make NVCC work for different versions (#55) | 1 year ago |
| AlpinDale | 6682ede3de | fix: clean up API servers | 1 year ago |
| henk717 | 0b2b62fe96 | Micromamba Runtime (#54) | 1 year ago |