| File | Commit | Message | Last modified |
| --- | --- | --- | --- |
| compressed_tensors | 4d4e767838 | ci: take one of fixing lint issues | 4 months ago |
| gguf_utils | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago |
| utils | 141672a0d4 | kernels: disambiguate quantized types via a new ScalarType | 4 months ago |
| __init__.py | 2b85ffb1a5 | chore: minor cleanups | 4 months ago |
| aqlm.py | 9be43994fe | feat: fbgemm quantization support (#601) | 5 months ago |
| autoquant.py | 9be43994fe | feat: fbgemm quantization support (#601) | 5 months ago |
| awq.py | 9be43994fe | feat: fbgemm quantization support (#601) | 5 months ago |
| awq_marlin.py | 141672a0d4 | kernels: disambiguate quantized types via a new ScalarType | 4 months ago |
| base_config.py | 0e6c400b13 | feat: re-add GGUF (#600) | 4 months ago |
| bitsandbytes.py | d4c9fcd6e6 | feat: support loading pre-quanted bnb checkpoints | 4 months ago |
| deepspeedfp.py | 6b1fdd07bd | chore: add isort and refactor formatting script and utils | 4 months ago |
| eetq.py | 9be43994fe | feat: fbgemm quantization support (#601) | 5 months ago |
| exl2.py | 9be43994fe | feat: fbgemm quantization support (#601) | 5 months ago |
| fbgemm_fp8.py | a20e2ce155 | fix: pass cutlass_fp8_supported correctly for fbgemm_fp8 | 4 months ago |
| fp8.py | 4d4e767838 | ci: take one of fixing lint issues | 4 months ago |
| gguf.py | 4d4e767838 | ci: take one of fixing lint issues | 4 months ago |
| gptq.py | 208cd5405f | fix: cpu offloading with gptq | 4 months ago |
| gptq_marlin.py | 208cd5405f | fix: cpu offloading with gptq | 4 months ago |
| gptq_marlin_24.py | 141672a0d4 | kernels: disambiguate quantized types via a new ScalarType | 4 months ago |
| hadamard.safetensors | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago |
| kv_cache.py | e81590d293 | fix: `kv_cache_dtype=fp8` without scales for fp8 checkpoints | 4 months ago |
| marlin.py | 9be43994fe | feat: fbgemm quantization support (#601) | 5 months ago |
| qqq.py | e3f07b22c3 | feat: support for QQQ W4A8 quantization (#612) | 4 months ago |
| quip.py | 9be43994fe | feat: fbgemm quantization support (#601) | 5 months ago |
| quip_utils.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago |
| schema.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago |
| squeezellm.py | 9be43994fe | feat: fbgemm quantization support (#601) | 5 months ago |