david/aphrodite-engine: PygmalionAI's large-scale inference engine (pygmalion.chat). It is designed to serve as the inference endpoint for the PygmalionAI website and to serve the Pygmalion models to a large number of users at high speed (thanks to vLLM's Paged Attention). @ c3b15f0926dd24885d9f780f01a451d26fd52653

AlpinDale 3ed4cc431c enc_dec attention code		9 months ago
..
quantization	f8652c8e99 fix: optimize aqlm dequantization (#325)	10 months ago
triton_kernel	e42a78381a feat: switch from pylint to ruff (#322)	10 months ago
__init__.py	07aa2a492f upstream: add option to specify tokenizer	1 year ago
activation.py	e31c6f0b45 feat: refactor modeling logic and support more models (#274)	10 months ago
attention.py	58e89e29d9 add custom bias to attention.py	9 months ago
enc_dec_attention.py	3ed4cc431c enc_dec attention code	9 months ago
layernorm.py	e31c6f0b45 feat: refactor modeling logic and support more models (#274)	10 months ago
linear.py	e42a78381a feat: switch from pylint to ruff (#322)	10 months ago
rejection.py	95bdd35ec9 feat: rejection sampler (#197)	1 year ago
rotary_embedding.py	e42a78381a feat: switch from pylint to ruff (#322)	10 months ago
sampler.py	da223153c6 feat&fix: cohere support and missing GPU blocks (#333)	9 months ago
vocab_parallel_embedding.py	968bde81bf fix: tensor parallel with GPTQ and AWQ quants (#307)	10 months ago