Commit History

Author SHA1 Message Date
  AlpinDale 4f229078cf sampler: vectorized dry sampler 1 month ago
  AlpinDale 1405051912 attention: add `AttentionState` abstraction (#863) 1 month ago
  AlpinDale 82eabb6aa7 build: add jinja2 to requirements file (#862) 1 month ago
  AlpinDale 9094a8a2a3 xpu: refactor XPU worker & executor (#861) 1 month ago
  AlpinDale 8b8d2ce7e2 ci: bump aphrodite version to 0.6.4.post1 (#859) 1 month ago
  AlpinDale 3392b81bf9 sampler: allow parsing sampler order using strings (#858) 1 month ago
  AlpinDale 0035dc42ed sampler: optimize DRY performance using z-algorithm (#856) 1 month ago
  AlpinDale 2150bb5019 sampler: add range parameter for DRY (#855) 1 month ago
  AlpinDale 72c505ad84 sampler: fix dry concurrency issue (#852) 1 month ago
  Selali 14ac216498 sampler: add output_tokens to DRY sampler (#849) 1 month ago
  Luke Harold Miles d486d7ac01 docs: add linux arm64/aarch64/GH200 installation tips (#851) 1 month ago
  AlpinDale d2971a6831 ci: bump version to 0.6.4 (#845) 1 month ago
  AlpinDale 538471f76e chore: bump mistral_common to 1.5.0 (#844) 1 month ago
  AlpinDale 483c9e6e59 fix: disable awq_marlin override for awq models (#843) 1 month ago
  AlpinDale dfa34d1b24 feat: add sampler_priorty (#837) 1 month ago
  AlpinDale 93bc863591 feat: Machete Kernels for Hopper GPUs (#842) 1 month ago
  AlpinDale 563e8f7ac8 fix: latency and serving benchmarks (#841) 1 month ago
  AlpinDale 7c7ec12f36 chore: refactor executor classes for easier inheritance (#840) 1 month ago
  AlpinDale 16b587c104 fix: hidden states handling in batch expansion for spec decoding (#839) 1 month ago
  AlpinDale 60f7b828d5 feat: add skew sampling (#834) 1 month ago
  AlpinDale ba9d8f631a feat: add no_repeat_ngram sampler (#832) 2 months ago
  Selali 4c4a365f77 feat: Add DRY (Don't Repeat Yourself) sampling (#827) 2 months ago
  AlpinDale 48a8693aed feat: multi-step scheduling (#831) 2 months ago
  AlpinDale 2242cb25dc fix: unbound tokenizer error 2 months ago
  AlpinDale 3d83e64f8e feat: add metrics for prefix cache hit rate (#829) 2 months ago
  AlpinDale 22425b689d fix: XPU build 2 months ago
  AlpinDale bfc8988116 feat: add cuda sampling kernels for top_k and top_p (#828) 2 months ago
  AlpinDale 22427602eb feat: add top-nsigma sampling method 2 months ago
  AlpinDale 22429e4a10 fix: sampler test with new transformers version 2 months ago
  AlpinDale 2f61644f6e SPMD optimizations (#824) 2 months ago