david/aphrodite-engine: PygmalionAI's large-scale inference engine (pygmalion.chat). It is designed to serve as the inference endpoint for the PygmalionAI website and to allow serving the Pygmalion models to a large number of users with blazing-fast speeds (thanks to vLLM's Paged Attention). @ chore/test-updates

AlpinDale 071269e406 feat: FP8 E4M3 KV Cache (#405)		9 months ago
..
aqlm	705821a7fe feat: AQLM quantization support (#293)	10 months ago
awq	41beab5dc1 add exllamav2 tensor paralell, fused MoE for GPTQ/AWQ	9 months ago
bitsandbytes	a98babfb74 fix: bnb on Turing GPUs (#299)	10 months ago
exl2	41beab5dc1 add exllamav2 tensor paralell, fused MoE for GPTQ/AWQ	9 months ago
fp8	071269e406 feat: FP8 E4M3 KV Cache (#405)	9 months ago
fp8_e5m2_kvcache	8e1cd54497 fix: do not include fp8 for rocm (#271)	10 months ago
gguf	89c32b40ec chore: add new imatrix quants (#320)	10 months ago
gptq	41beab5dc1 add exllamav2 tensor paralell, fused MoE for GPTQ/AWQ	9 months ago
int8_kvcache	9810daa699 feat: INT8 KV Cache (#298)	10 months ago
marlin	41beab5dc1 add exllamav2 tensor paralell, fused MoE for GPTQ/AWQ	9 months ago
quip	aebd68c632 feat: backport kernels (#235)	11 months ago
squeezellm	8fa608aeb7 feat: replace Ray with NCCL for control plane comms (#221)	11 months ago