david/aphrodite-engine: PygmalionAI's large-scale inference engine (pygmalion.chat). It is designed to serve as the inference endpoint for the PygmalionAI website and to serve the Pygmalion models to a large number of users at high speed, thanks to vLLM's Paged Attention. @ 55fc24b4383d5424263dda89d5f53f59a14c239e

AlpinDale 92cee435e2 rocm: add more quants, fix _scaled_mm call (#1062)		1 month ago
schemes	92cee435e2 rocm: add more quants, fix _scaled_mm call (#1062)	1 month ago
__init__.py	f1d0b77c92 [0.6.0] Release Candidate (#481)	5 months ago
compressed_tensors.py	f2b6dc3872 cpu: add support for W8A8 quantization via compressed-tensor (#1017)	1 month ago
compressed_tensors_moe.py	201db10f02 models: add support for Phi3 MoE	1 month ago
utils.py	93bc863591 feat: Machete Kernels for Hopper GPUs (#842)	2 months ago