david/aphrodite-engine: PygmalionAI's large-scale inference engine (pygmalion.chat). It is designed to serve as the inference endpoint for the PygmalionAI website and to serve the Pygmalion models to a large number of users at blazing-fast speeds (thanks to vLLM's PagedAttention). @ rc_054

AlpinDale 4d4e767838 ci: take one of fixing lint issues		4 months ago
..
__init__.py	04b53d2db5 chore: add initializer files	1 year ago
cache_engine.py	5289c14b24 feat: Asymmetric Tensor Parallel (#594)	4 months ago
cpu_model_runner.py	705e50f4bd fix: broadcasting logic for multi_modal_kwargs	4 months ago
cpu_worker.py	42c66d5b00 feat: tensor parallelism for CPU backend	4 months ago
embedding_model_runner.py	705e50f4bd fix: broadcasting logic for multi_modal_kwargs	4 months ago
model_runner.py	4d4e767838 ci: take one of fixing lint issues	4 months ago
model_runner_base.py	d8a51d05a7 fix: seeded gens with pipeline parallel	4 months ago
neuron_model_runner.py	705e50f4bd fix: broadcasting logic for multi_modal_kwargs	4 months ago
neuron_worker.py	ae04f57ec1 feat: Pipeline Parallel support (#581)	4 months ago
openvino_model_runner.py	705e50f4bd fix: broadcasting logic for multi_modal_kwargs	4 months ago
openvino_worker.py	1ff6d4c3d7 feat: support pipeline parallel on indivisible GPU count (#587)	4 months ago
tpu_model_runner.py	eef647deab fix: greedy decoding in TPU	4 months ago
tpu_worker.py	269e9aabda fix: set readonly=True for non-root TPU devices	4 months ago
worker.py	6979ff658e chore: perform allreduce in fp32 for marlin, better logging	4 months ago
worker_base.py	523ac99aca chore: pipeline parallel with Ray accelerated dag	4 months ago
xpu_model_runner.py	705e50f4bd fix: broadcasting logic for multi_modal_kwargs	4 months ago
xpu_worker.py	99680b2d23 feat: soft prompts (#589)	4 months ago