AlpinDale
|
cda0e93a10
abstract away the platform for device capability
|
7 months ago |
AlpinDale
|
0f4a9ee77b
quantized lm_head (#582)
|
7 months ago |
AlpinDale
|
7d79c0e726
chore: use nvml query to avoid accidental cuda initialization
|
7 months ago |
AlpinDale
|
156f577f79
feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)
|
7 months ago |
AlpinDale
|
5cedee9024
fix gemma with gptq marlin
|
7 months ago |
AlpinDale
|
5b0c11d190
support pipeline parallel pynccl groups
|
8 months ago |
AlpinDale
|
c66b1b57b1
Marlin 2:4 sparsity (#555)
|
8 months ago |
AlpinDale
|
ad1c6b86a1
gptq_marlin: enable bfloat16
|
8 months ago |
AlpinDale
|
c154578c97
gptq_marlin: 8bit GPTQ support
|
8 months ago |
AlpinDale
|
ac5b4b6aa7
broadcast metadata through cpu
|
8 months ago |
AlpinDale
|
f22b700ee4
feat: marlin kernels for GPTQ (#547)
|
8 months ago |