AlpinDale
|
41beab5dc1
add exllamav2 tensor parallel, fused MoE for GPTQ/AWQ
|
9 months ago |
AlpinDale
|
5053743c1c
feat: speedup AWQ (#223)
|
11 months ago |
AlpinDale
|
ce5e2332ea
fix: launch AWQ kernels on the current CUDAStream (#75)
|
1 year ago |
AlpinDale
|
7572e1dd59
overflow in AWQ GEMM kernel
|
1 year ago |
AlpinDale
|
9f7a0e3ecb
feat: AWQ support for Turing GPUs (#53)
|
1 year ago |
AlpinDale
|
75c27d3e65
massive overhaul
|
1 year ago |
AlpinDale
|
798e6923f1
align CUDA kernels with original AWQ impl
|
1 year ago |
AlpinDale
|
d9c1d4f6e5
add awq support
|
1 year ago |
AlpinDale
|
39beed0b87
Revert "Refactor AWQ support."
|
1 year ago |
AlpinDale
|
579071b570
Revert "fix the awq gemm kernels"
|
1 year ago |
AlpinDale
|
20c27863c1
fix the awq gemm kernels
|
1 year ago |
AlpinDale
|
d09e27f5d4
Refactor AWQ support.
|
1 year ago |