.. |
amd | 251568470e initial nvidia fp8 e4m3 for kv cache | 7 months ago |
nvidia | 3bdeb3e116 fix: clang formatting for all kernels (#558) | 7 months ago |
common.cu | 37c6da9eb3 feat: vectorized fp8 quant kernel | 7 months ago |
fp8_marlin.cu | ad24e74a99 feat: FP8 weight-only quantization support for Ampere GPUs | 6 months ago |