| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| AlpinDale | ad24e74a99 | feat: FP8 weight-only quantization support for Ampere GPUs | 6 months ago |
| AlpinDale | 5be90c3859 | Mamba infrastructure support (#586) | 6 months ago |
| AlpinDale | c0c336aaa3 | refactor: registry for processing model inputs; quick_gelu; clip model support | 7 months ago |
| AlpinDale | 5b464d36ea | feat: bias epilogue support for cutlass kernels | 7 months ago |
| AlpinDale | cd9ed8623b | fix: cuda version check for fp8 support in the cutlass kernels | 7 months ago |
| AlpinDale | 6a57861fca | feat: initial XPU support via intel_extension_for_pytorch (#571) | 7 months ago |
| AlpinDale | 7e54c3916d | chore: factor out epilogues from cutlass kernels | 7 months ago |
| AlpinDale | b4ddd79f3a | fix: warn user when using outdated compiled binary | 7 months ago |
| AlpinDale | 156f577f79 | feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569) | 7 months ago |