david/aphrodite-engine: PygmalionAI's large-scale inference engine (pygmalion.chat). It is designed to serve as the inference endpoint for the PygmalionAI website, and to serve the Pygmalion models to a large number of users at high speed (thanks to vLLM's Paged Attention). @ 05e45aeb53702dac44913e0241b4ac2dabbe3a87

AlpinDale ad24e74a99 feat: FP8 weight-only quantization support for Ampere GPUs		6 months ago
..
aqlm	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
autoquant	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
awq	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
compressed_tensors	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
cutlass_w8a8	b03b4d4c8c fix: compute cutlass 3.x epilogues in fp32 instead of 16	7 months ago
exl2	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
fp8	ad24e74a99 feat: FP8 weight-only quantization support for Ampere GPUs	6 months ago
gguf	9d81716bfd [v0.5.3] Release Candidate (#388)	10 months ago
gptq	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
gptq_marlin	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
int8_kvcache	9810daa699 feat: INT8 KV Cache (#298)	1 year ago
marlin	1587fab5de fix: cuda version check for mma warning suppression	7 months ago
quip	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
squeezellm	156f577f79 feat: switch from `PYBIND11_MODULE` to `TORCH_LIBRARY` (#569)	7 months ago
quant_ops.h	ad24e74a99 feat: FP8 weight-only quantization support for Ampere GPUs	6 months ago