File | Commit | Commit message | Last updated
broadcast_load_epilogue_c2x.hpp | 54f4f1e7f3 | allow the cutlass kernels to take scales that reside on the GPU | 7 months ago
broadcast_load_epilogue_c3x.hpp | 54f4f1e7f3 | allow the cutlass kernels to take scales that reside on the GPU | 7 months ago
common.hpp | 2313c97e3d | add cutlass w8a8 kernels (#556) | 7 months ago
scaled_mm_dq_c2x.cu | 156f577f79 | feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569) | 7 months ago
scaled_mm_dq_c3x.cu | 156f577f79 | feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569) | 7 months ago
scaled_mm_dq_entry.cu | 156f577f79 | feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569) | 7 months ago