| Name | Commit | Message | Last updated |
|---|---|---|---|
| attention | b9b295d74e | chore: backlogs 1 (#191) | 1 year ago |
| quantization | 801eda0b7a | feat: support GPTQ 2, 3, and 8bit quants (#181) | 1 year ago |
| activation_kernels.cu | b9b295d74e | chore: backlogs 1 (#191) | 1 year ago |
| cache.h | 1aab8a7d6f | feat: speedup compilation times by 3x (#130) | 1 year ago |
| cache_kernels.cu | b9b295d74e | chore: backlogs 1 (#191) | 1 year ago |
| cuda_compat.h | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago |
| cuda_utils.h | 1aab8a7d6f | feat: speedup compilation times by 3x (#130) | 1 year ago |
| cuda_utils_kernels.cu | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago |
| dispatch_utils.h | 32844c1522 | add GELU kernels and remove compile bloat | 1 year ago |
| layernorm_kernels.cu | b9b295d74e | chore: backlogs 1 (#191) | 1 year ago |
| ops.h | 801eda0b7a | feat: support GPTQ 2, 3, and 8bit quants (#181) | 1 year ago |
| pos_encoding_kernels.cu | b9b295d74e | chore: backlogs 1 (#191) | 1 year ago |
| pybind.cpp | 62b2c4119d | feat: re-write GPTQ and refactor exllama kernels (#152) | 1 year ago |
| reduction.cuh | 1334a833a4 | feat: AMD ROCm support (#95) | 1 year ago |