Commit History

Author SHA1 Message Date
  AlpinDale a985143768 core: add cuda graph support for encoder-decoder models (#1051) 4 weeks ago
  AlpinDale c951a54d21 fix: multi-step + flashinfer with cuda graphs (#1036) 4 weeks ago
  AlpinDale 1390915778 multi-step: add support for flashinfer attention backend (#1033) 4 weeks ago
  AlpinDale 3bb0f07461 chore: rename `task_handler` to `worker` (#985) 1 month ago
  AlpinDale d6cbbba95f Revert "fix: issues with flashinfer fp8 kv (#950)" (#956) 1 month ago
  AlpinDale cef6da8863 fix: issues with flashinfer fp8 kv (#950) 1 month ago
  AlpinDale 4ddc14d653 core: use flashinfer for FP8 KV when available (#944) 1 month ago
  AlpinDale 1405051912 attention: add `AttentionState` abstraction (#863) 1 month ago
  AlpinDale 60b702a827 chore: register custom torch ops for flash-attn and flashinfer (#724) 4 months ago
  AlpinDale 300f889554 chore: update flashinfer to v0.1.3 (#685) 4 months ago
  AlpinDale 67ee885293 fix: flashinfer outputs (#657) 4 months ago
  AlpinDale f1d0b77c92 [0.6.0] Release Candidate (#481) 4 months ago