Commit History

Author SHA1 Message Date
  AlpinDale b01eec7c35 stop workflows on dev 11 months ago
  AlpinDale f01c668259 clean up sampler 11 months ago
  AlpinDale fa6af97a5a add new logits processor 11 months ago
  AlpinDale 78d66f16d1 Chunked Prefill Part 1 (#384) 11 months ago
  AlpinDale 9181fa0396 feat: Triton kernels for sampling (#383) 11 months ago
  AlpinDale e3252edd07 fix: remove event and stream, add typing (#382) 11 months ago
  AlpinDale 375f24ccca fix: optimize context shift performance (#380) 11 months ago
  AlpinDale 33b3786175 fix: cache neuron checks (#379) 11 months ago
  AlpinDale 8c9cabf4c8 fix: display error in ray before deadlock (#378) 11 months ago
  AlpinDale f587953f46 fix: yapf 11 months ago
  AlpinDale 4b99ac15b7 fix: do not deepcopy metadata 11 months ago
  AlpinDale 17b034613d chore: make metadata a dataclass (#377) 11 months ago
  AlpinDale 9534fcfb7b fix: build error 11 months ago
  AlpinDale 0b35176089 feat: add context-free grammars (#376) 11 months ago
  AlpinDale feb5840f2a feat: async tokenization (#374) 11 months ago
  IggoOnCode 2aec297c55 feat: add embeddings endpoint to openai rest-api server. (#363) 11 months ago
  AlpinDale 29c241c115 fix: explicitly disallow installation on non-linux platforms (#373) 11 months ago
  AlpinDale 439a826712 fix: broadcast group 11 months ago
  AlpinDale 935027bdcc feat: dynamic shared memory allocation for moe align block size (#372) 11 months ago
  AlpinDale 97a2b26c97 fix: assertion error when use_sliding_window is present 11 months ago
  AlpinDale e702f587cf feat: add batched RoPE kernels (#371) 11 months ago
  AlpinDale 3d6695cfbb feat: add approximate gelu activation kernels (#370) 11 months ago
  AlpinDale 5fa15b4435 fix: double free with sliding window (#369) 11 months ago
  AlpinDale 72cd8494aa feat: mistral neuron support (#368) 11 months ago
  AlpinDale 0f6d56b07f feat: model executor refactor (#367) 11 months ago
  AlpinDale b361096463 fix: tokenizer when using ray (#366) 11 months ago
  AlpinDale f8dfac6372 chore: attention refactor and upstream sync apr01 (#365) 11 months ago
  50h100a a39920bc99 Merge pull request #355 from 50h100a/pr_seedfix 11 months ago
  50h100a 051c60736e Merge pull request #356 from 50h100a/pr_samplerinternals 11 months ago
  50h100a d5dbd29db4 hoist sampler internals into a single function. 11 months ago