AlpinDale
|
f2107af0c1
fix: promote another index in fp8 kernel to int64_t
|
5 months ago |
AlpinDale
|
31552a81ff
fix: use int64_t for indices in fp8 kernels
|
5 months ago |
AlpinDale
|
c8f5424d72
add scale_ub inputs to fp8 dynamic per-token quant
|
5 months ago |
AlpinDale
|
6c4c20652b
feat: pipeline parallel support for mixtral
|
5 months ago |
AlpinDale
|
196e6b64f1
feat: add fp8 dynamic per-token quant kernel
|
5 months ago |
AlpinDale
|
37c6da9eb3
feat: vectorized fp8 quant kernel
|
6 months ago |
AlpinDale
|
156f577f79
feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)
|
6 months ago |
AlpinDale
|
3bdeb3e116
fix: clang formatting for all kernels (#558)
|
6 months ago |
AlpinDale
|
251568470e
initial nvidia fp8 e4m3 for kv cache
|
6 months ago |