david/aphrodite-engine @ rep_pen_range: PygmalionAI's large-scale inference engine (pygmalion.chat). It is designed to serve as the inference endpoint for the PygmalionAI website and to serve the Pygmalion models to a large number of users at high speed (thanks to vLLM's PagedAttention).

AlpinDale ccbda97416 fix: types in AQLM and GGUF for dynamo support (#736)		3 months ago
all_reduce	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
attention	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
backup	f8dfac6372 chore: attention refactor and upstream sync apr01 (#365)	9 months ago
core	9296d4b25d feat: dynamo support for ScalarType (#733)	3 months ago
cpu	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
hadamard	5d288aa76c feat: add fast hadamard transformation kernels (#232)	11 months ago
mamba	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
moe	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
prepare_inputs	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
punica	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
quantization	ccbda97416 fix: types in AQLM and GGUF for dynamo support (#736)	3 months ago
activation_kernels.cu	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
cache.h	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
cache_kernels.cu	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
cuda_compat.h	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
cuda_utils.h	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
cuda_utils_kernels.cu	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
dispatch_utils.h	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
layernorm_kernels.cu	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
ops.h	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
pos_encoding_kernels.cu	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
reduction.cuh	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
torch_bindings.cpp	a401f8e05d feat: per-tensor token epilogue kernels (#630)	4 months ago