david/aphrodite-engine: PygmalionAI's large-scale inference engine (pygmalion.chat). It is designed to serve as the inference endpoint for the PygmalionAI website and to serve the Pygmalion models to a large number of users at high speed, thanks to vLLM's PagedAttention. @ 71a26f0998ce6eac8a8ec6f49484a1a857e18fbb

AlpinDale e9c0a248dc fix: support check for fp8 cutlass		7 months ago
compressed_tensors	b2cb5a92e9 fix: missing cache_config for dbrx	7 months ago
gguf_utils	9d81716bfd [v0.5.3] Release Candidate (#388)	10 months ago
__init__.py	690110a051 feat: bitsandbytes quantization	7 months ago
aqlm.py	2649f3f14e aqlm works on pascal	7 months ago
autoquant.py	0307da9e15 refactor: bitsandbytes -> autoquant	7 months ago
awq.py	c66b1b57b1 Marlin 2:4 sparsity (#555)	7 months ago
base_config.py	c66b1b57b1 Marlin 2:4 sparsity (#555)	7 months ago
bitsandbytes.py	690110a051 feat: bitsandbytes quantization	7 months ago
deepspeedfp.py	4acf34417a feat: add DeepSpeedFP quantization for all models	7 months ago
eetq.py	b178ae4b4a chore: generalize linear_method to be quant_method (#540)	8 months ago
exl2.py	b178ae4b4a chore: generalize linear_method to be quant_method (#540)	8 months ago
fp8.py	e9c0a248dc fix: support check for fp8 cutlass	7 months ago
gguf.py	b178ae4b4a chore: generalize linear_method to be quant_method (#540)	8 months ago
gptq.py	c66b1b57b1 Marlin 2:4 sparsity (#555)	7 months ago
gptq_marlin.py	5cedee9024 fix gemma with gptq marlin	7 months ago
gptq_marlin_24.py	f6250c5516 move dockerfiles to root; fix cpu build	7 months ago
hadamard.safetensors	9d81716bfd [v0.5.3] Release Candidate (#388)	10 months ago
marlin.py	c66b1b57b1 Marlin 2:4 sparsity (#555)	7 months ago
quip.py	c66b1b57b1 Marlin 2:4 sparsity (#555)	7 months ago
quip_utils.py	9d81716bfd [v0.5.3] Release Candidate (#388)	10 months ago
schema.py	9d81716bfd [v0.5.3] Release Candidate (#388)	10 months ago
squeezellm.py	f6250c5516 move dockerfiles to root; fix cpu build	7 months ago