Commit History

Author SHA1 Message Date
  AlpinDale f2107af0c1 fix: promote another index in fp8 kernel to int64_t 5 months ago
  AlpinDale 31552a81ff fix: use int64_t for indices in fp8 kernels 5 months ago
  AlpinDale c8f5424d72 add scale_ub inputs to fp8 dynamic per-token quant 5 months ago
  AlpinDale 6c4c20652b feat: pipeline parallel support for mixtral 5 months ago
  AlpinDale 196e6b64f1 feat: add fp8 dynamic per-token quant kernel 5 months ago
  AlpinDale 37c6da9eb3 feat: vectorized fp8 quant kernel 6 months ago
  AlpinDale 156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569) 6 months ago
  AlpinDale 3bdeb3e116 fix: clang formatting for all kernels (#558) 6 months ago
  AlpinDale 251568470e initial nvidia fp8 e4m3 for kv cache 6 months ago