Commit History

Author     SHA1        Message                                                     Date
AlpinDale  40f63268ee  disable new layernorm kernels for CUDA < 12.0               9 months ago
AlpinDale  d68fad5a79  feat: add optimized layernorm kernels (#398)                9 months ago
AlpinDale  8fa608aeb7  feat: replace Ray with NCCL for control plane comms (#221)  1 year ago
AlpinDale  b9b295d74e  chore: backlogs 1 (#191)                                    1 year ago
AlpinDale  7612f33afd  feat: fused add RMSNorm kernels (#125)                      1 year ago
AlpinDale  3d72f05c7b  feat: flattened 1D tensor -> 2D tensor (#85)                1 year ago
AlpinDale  32844c1522  add GELU kernels and remove compile bloat                   1 year ago
AlpinDale  081545bde6  fix: various CUDA kernel tweaks                             1 year ago
AlpinDale  b8f4337c5b  chore: various fixes                                        1 year ago
AlpinDale  0ec53128b6  feat: add layernorm kernels                                 1 year ago