..
attention | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago
quantization | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago
activation_kernels.cu | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago
cache.h | 1aab8a7d6f | feat: speedup compilation times by 3x (#130) | 1 year ago
cache_kernels.cu | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago
cuda_compat.h | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago
cuda_utils.h | 1aab8a7d6f | feat: speedup compilation times by 3x (#130) | 1 year ago
cuda_utils_kernels.cu | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago
dispatch_utils.h | 32844c1522 | add GELU kernels and remove compile bloat | 1 year ago
layernorm_kernels.cu | 7612f33afd | feat: fused add RMSNorm kernels (#125) | 1 year ago
ops.h | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago
pos_encoding_kernels.cu | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago
pybind.cpp | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago
reduction.cuh | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago