Commit History

Author     SHA1        Message                                                       Date
AlpinDale  e14223dce5  kernel: use `cub::BlockReduce` instead of custom impl (#895)  3 weeks ago
AlpinDale  f1d0b77c92  [0.6.0] Release Candidate (#481)                              4 months ago
AlpinDale  9d81716bfd  [v0.5.3] Release Candidate (#388)                             8 months ago
AlpinDale  8fa608aeb7  feat: replace Ray with NCCL for control plane comms (#221)    11 months ago
AlpinDale  b9b295d74e  chore: backlogs 1 (#191)                                      1 year ago
AlpinDale  7612f33afd  feat: fused add RMSNorm kernels (#125)                        1 year ago
AlpinDale  3d72f05c7b  feat: flattened 1D tensor -> 2D tensor (#85)                  1 year ago
AlpinDale  32844c1522  add GELU kernels and remove compile bloat                     1 year ago
AlpinDale  081545bde6  fix: various CUDA kernel tweaks                               1 year ago
AlpinDale  b8f4337c5b  chore: various fixes                                          1 year ago
AlpinDale  0ec53128b6  feat: add layernorm kernels                                   1 year ago