attention | 45f6d9f923 initial refactor commit | 1 year ago
quantization | d9c1d4f6e5 add awq support | 1 year ago
activation.cpp | 32844c1522 add GELU kernels and remove compile bloat | 1 year ago
activation_kernels.cu | 32844c1522 add GELU kernels and remove compile bloat | 1 year ago
attention.cpp | 24c78e7306 optimization: multi-query attention kernel | 1 year ago
cache.cpp | 081545bde6 fix: various CUDA kernel tweaks | 1 year ago
cache_kernels.cu | 32844c1522 add GELU kernels and remove compile bloat | 1 year ago
dispatch_utils.h | 32844c1522 add GELU kernels and remove compile bloat | 1 year ago
layernorm.cpp | 081545bde6 fix: various CUDA kernel tweaks | 1 year ago
layernorm_kernels.cu | 32844c1522 add GELU kernels and remove compile bloat | 1 year ago
pos_encoding.cpp | 45f6d9f923 initial refactor commit | 1 year ago
pos_encoding_kernels.cu | 45f6d9f923 initial refactor commit | 1 year ago
quantization.cpp | d9c1d4f6e5 add awq support | 1 year ago
reduction.cuh | 081545bde6 fix: various CUDA kernel tweaks | 1 year ago