| Name | Commit | Message | Updated |
|---|---|---|---|
| attention | 53d391e1f2 | merge 'dev' into 'main' | 1 year ago |
| quantization | 15a0454172 | feat: FP8 KV Cache (#185) | 1 year ago |
| activation_kernels.cu | b9b295d74e | chore: backlogs 1 (#191) | 1 year ago |
| cache.h | 15a0454172 | feat: FP8 KV Cache (#185) | 1 year ago |
| cache_kernels.cu | 15a0454172 | feat: FP8 KV Cache (#185) | 1 year ago |
| cuda_compat.h | 2755a48d51 | merge dev branch into main (#153) | 1 year ago |
| cuda_utils.h | 2755a48d51 | merge dev branch into main (#153) | 1 year ago |
| cuda_utils_kernels.cu | 2755a48d51 | merge dev branch into main (#153) | 1 year ago |
| dispatch_utils.h | 7e72ce0a73 | feat: mixtral tensor parallelism (#193) | 1 year ago |
| layernorm_kernels.cu | b9b295d74e | chore: backlogs 1 (#191) | 1 year ago |
| misc_kernels.cu | 7e72ce0a73 | feat: mixtral tensor parallelism (#193) | 1 year ago |
| ops.h | 7e72ce0a73 | feat: mixtral tensor parallelism (#193) | 1 year ago |
| pos_encoding_kernels.cu | b9b295d74e | chore: backlogs 1 (#191) | 1 year ago |
| pybind.cpp | 7e72ce0a73 | feat: mixtral tensor parallelism (#193) | 1 year ago |
| reduction.cuh | 2755a48d51 | merge dev branch into main (#153) | 1 year ago |