Commit History

Author     SHA1        Message                                                   Date
AlpinDale  f8dfac6372  chore: attention refactor and upstream sync apr01 (#365)  9 months ago
AlpinDale  9810daa699  feat: INT8 KV Cache (#298)                                10 months ago
AlpinDale  23389d0108  zero out a variable instead of vector in kernels          1 year ago
AlpinDale  081545bde6  fix: various CUDA kernel tweaks                           1 year ago
AlpinDale  05d0a7e763  feat: adapt the attention kernels                         1 year ago
AlpinDale  3c3944153c  feat: add generic attention and FP32 dtype kernels        1 year ago