david/aphrodite-engine: PygmalionAI's large-scale inference engine (pygmalion.chat). It is designed to serve as the inference endpoint for the PygmalionAI website, and to allow serving the Pygmalion models to a large number of users with blazing fast speeds (thanks to vLLM's Paged Attention). @ 1efd0f89b7351ccfc93cfb0faefa4edd7be5462f

AlpinDale ad24e74a99 feat: FP8 weight-only quantization support for Ampere GPUs		6 months ago
..
aqlm	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
autoquant	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
awq	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
compressed_tensors	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
cutlass_w8a8	b03b4d4c8c fix: compute cutlass 3.x epilogues in fp32 instead of 16	7 months ago
exl2	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
fp8	ad24e74a99 feat: FP8 weight-only quantization support for Ampere GPUs	6 months ago
gguf	9d81716bfd [v0.5.3] Release Candidate (#388)	10 months ago
gptq	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
gptq_marlin	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
int8_kvcache	9810daa699 feat: INT8 KV Cache (#298)	1 year ago
marlin	1587fab5de fix: cuda version check for mma warning suppression	7 months ago
quip	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
squeezellm	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
quant_ops.h	ad24e74a99 feat: FP8 weight-only quantization support for Ampere GPUs	6 months ago