Commit History

Author SHA1 Message Date
  AlpinDale 4ed1bb9958 chore: add fault tolerance for RayTokenizerGroupPool 7 months ago
  AlpinDale 622de63c03 fix: remove useless code from cpu worker 7 months ago
  AlpinDale 80ac1cdc8f fix: add args for the draft tp 7 months ago
  AlpinDale abbb730607 feat: support draft model on different tensor parallel size 7 months ago
  AlpinDale 5974495461 chore: phi3v resize for dynamic shape 7 months ago
  AlpinDale e238abf0cc chore: send and recv helper functions 7 months ago
  AlpinDale 051daa0435 fix: add cutlass2x fallback kernels 7 months ago
  AlpinDale 3389dcdde5 fix: why the hell was this not committed? 7 months ago
  AlpinDale 25feb1d592 chore: add support for pinning lora adapters in the lru cache 7 months ago
  AlpinDale 0c662bc813 fix: exclude modelscope 1.15.0 7 months ago
  AlpinDale b3d2b639d2 feat: add `gelu_quick` CPU kernel 7 months ago
  AlpinDale 1b340083b1 feat: add shm broadcast 7 months ago
  AlpinDale 8a6e83b52e feat: fully sharded QKVParallelLinearWithLora support 7 months ago
  AlpinDale 4f42985b5c feat: qwen2 lora shapes 7 months ago
  AlpinDale af43576da0 feat: add MLPSpeculator speculative decoding support (#572) 7 months ago
  AlpinDale ead08e9711 fix: missing next_pow_2 header function 7 months ago
  AlpinDale 7a3e38f79c fix: cutlass kernel compilation 7 months ago
  AlpinDale 017b42c517 chore: use fork as the default method for mp backend 7 months ago
  AlpinDale cd9ed8623b fix: cuda version check for fp8 support in the cutlass kernels 7 months ago
  AlpinDale fad77538de feat: update cutlass int8 kernel configs for sm90 7 months ago
  AlpinDale b753ff7870 feat: per-channel support for static activation quant 7 months ago
  AlpinDale 3c7444c89b fix: asyncio.run hangs in python < 3.12 7 months ago
  AlpinDale d44ac8e497 fix: `--preemption_mode` -> `--preemption-mode` 7 months ago
  AlpinDale bcf9c83e6a fix: incorrect args passed to generate() method in phi3v example 7 months ago
  AlpinDale 025322ee5f fix: fp8 kv cache for qwen2 models 7 months ago
  AlpinDale 323fe23b21 chore: use 127.0.0.1 for single-node setups 7 months ago
  AlpinDale 89be49d058 fix: build for mi300x 7 months ago
  AlpinDale 7d3da17e19 fix: phi3 rope scaling 7 months ago
  AlpinDale 765adcfba1 chore: add w8a8 benchmark scripts 7 months ago
  AlpinDale 1587fab5de fix: cuda version check for mma warning suppression 7 months ago