david/aphrodite-engine: PygmalionAI's large-scale inference engine (pygmalion.chat). It is designed to serve as the inference endpoint for the PygmalionAI website, and to allow serving the Pygmalion models to a large number of users at high speed (thanks to vLLM's Paged Attention). @ 4599c98f9904cdd14e0cad6d40b4d091db1099ee

AlpinDale 4599c98f99 feat: dynamic image size support for VLMs		7 months ago
fused_moe	cf472315cc refactor: isolate FP8 from mixtral	7 months ago
mamba	5be90c3859 Mamba infrastrucuture support (#586)	7 months ago
ops	fca911ee0a vLLM Upstream Sync (#526)	8 months ago
__init__.py	07aa2a492f upstream: add option to specify tokenizer	1 year ago
activation.py	c0c336aaa3 refactor: registry for processing model inputs; quick_gelu; clip model support	7 months ago
layernorm.py	6a57861fca feat: initial XPU support via intel_extension_for_pytorch (#571)	7 months ago
linear.py	ddb3323f94 refactor: have w8a8 compressed tensors use `process_weights_after_load` for fp8	7 months ago
logits_processor.py	0f4a9ee77b quantized lm_head (#582)	7 months ago
pooler.py	be8154a8a0 feat: proper embeddings API with e5-mistral-7b support	8 months ago
rejection_sampler.py	7253e9052d feat: integrate typical acceptance sampling for spec decoding	7 months ago
rotary_embedding.py	4599c98f99 feat: dynamic image size support for VLMs	7 months ago
sampler.py	be8154a8a0 feat: proper embeddings API with e5-mistral-7b support	8 months ago
spec_decode_base_sampler.py	7253e9052d feat: integrate typical acceptance sampling for spec decoding	7 months ago
typical_acceptance_sampler.py	7253e9052d feat: integrate typical acceptance sampling for spec decoding	7 months ago
vocab_parallel_embedding.py	0f4a9ee77b quantized lm_head (#582)	7 months ago