| File | Commit | Message | Last updated |
|---|---|---|---|
| broadcast_load_epilogue_c2x.hpp | 54f4f1e7f3 | allow the cutlass kernels to take scales that reside on the GPU | 5 months ago |
| broadcast_load_epilogue_c3x.hpp | 94f4e278ff | fix: illegal mem access for cutlass fp8 kernels | 5 months ago |
| common.hpp | 051daa0435 | fix: add cutlass2x fallback kernels | 5 months ago |
| scaled_mm_c2x.cu | 5b464d36ea | feat: bias epilogue support for cutlass kernels | 5 months ago |
| scaled_mm_c3x.cu | b03b4d4c8c | fix: compute cutlass 3.x epilogues in fp32 instead of 16 | 5 months ago |
| scaled_mm_entry.cu | e7e847c3df | fix: turn off cutlass scaled_mm for ada lovelace cards | 4 months ago |