5fbd93797d  fix: beta value in gelu_tanh kernel being divided by 0.5 (AlpinDale, 4 months ago)
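For context on the fix above: in the standard tanh approximation of GELU, the inner scale is beta = sqrt(2/pi) ≈ 0.7979, while 0.5 is a separate outer scale; the commit title suggests the two constants were being conflated in the kernel. A minimal Python sketch of the reference formula (not the project's CUDA kernel), with the constants as commonly defined:

```python
import math

def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU:
    0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x**3)))."""
    beta = math.sqrt(2.0 / math.pi)  # inner scale, ~0.7978845608
    kappa = 0.044715                 # cubic-term coefficient
    inner = beta * (x + kappa * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))  # 0.5 is the outer scale
```

For large positive x the function approaches x, and gelu_tanh(0) is exactly 0.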
c0c336aaa3  refactor: registry for processing model inputs; quick_gelu; clip model support (AlpinDale, 5 months ago)
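The `quick_gelu` mentioned above is commonly defined as x * sigmoid(1.702 * x), the cheaper sigmoid-based GELU approximation used by CLIP-style models — which fits this commit adding CLIP support. A hedged sketch of that standard formula (the project's actual kernel may differ):

```python
import math

def quick_gelu(x: float) -> float:
    # QuickGELU: x * sigmoid(1.702 * x), a cheap GELU approximation
    # (x * sigmoid(a*x) simplifies to x / (1 + exp(-a*x)))
    return x / (1.0 + math.exp(-1.702 * x))
```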
156f577f79  feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569) (AlpinDale, 5 months ago)
3d6695cfbb  feat: add approximate gelu activation kernels (#370) (AlpinDale, 9 months ago)
e31c6f0b45  feat: refactor modeling logic and support more models (#274) (AlpinDale, 10 months ago)
8fa608aeb7  feat: replace Ray with NCCL for control plane comms (#221) (AlpinDale, 11 months ago)
b9b295d74e  chore: backlogs 1 (#191) (AlpinDale, 1 year ago)
7d91e9e0f2  feat: CUDA graphs (#172) (AlpinDale, 1 year ago)
1334a833a4  feat: AMD ROCm support (#95) (AlpinDale, 1 year ago)
5175605f8d  fix: yarn (#112) (AlpinDale, 1 year ago)
3d72f05c7b  feat: flattened 1D tensor -> 2D tensor (#85) (AlpinDale, 1 year ago)
32844c1522  add GELU kernels and remove compile bloat (AlpinDale, 1 year ago)
081545bde6  fix: various CUDA kernel tweaks (AlpinDale, 1 year ago)
b8f4337c5b  chore: various fixes (AlpinDale, 1 year ago)
28866137ea  feat: add swiglu activation (AlpinDale, 1 year ago)
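On the swiglu commit: SwiGLU gates one half of a projection with the SiLU (Swish) of the other half, which is why activation kernels for it typically split the last dimension in two. A minimal Python sketch of that common formulation (an illustration, not the repository's kernel):

```python
import math

def silu(z: float) -> float:
    # SiLU / Swish: z * sigmoid(z), simplified to z / (1 + exp(-z))
    return z / (1.0 + math.exp(-z))

def swiglu(x: list[float]) -> list[float]:
    # Split the last dimension in half; gate the second half
    # with SiLU of the first half (elementwise product)
    n = len(x) // 2
    return [silu(a) * b for a, b in zip(x[:n], x[n:])]
```

Because silu(0) = 0, any position where the gating half is zero produces a zero output regardless of the value half.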