david/aphrodite-engine

Author	SHA1 Message	Date
AlpinDale	b03b4d4c8c fix: compute cutlass 3.x epilogues in fp32 instead of 16	7 months ago
AlpinDale	cdff8e89f9 feat: introduce `DraftModelRunner`	7 months ago
AlpinDale	9868bb2290 chore: make it clear that '%' should NOT be in tensor dict keys	7 months ago
AlpinDale	b8650ec51d fix: better error message for MLPSpeculator	7 months ago
AlpinDale	0886c361f4 feat: OpenVINO CPU backend (#576)	7 months ago
AlpinDale	d63690a0df chore: add fp8 examples	7 months ago
AlpinDale	b6ff0623a6 chore: clean up branding	7 months ago
AlpinDale	85ef2fe8b1 chore: clean up placeholder symbols	7 months ago
AlpinDale	1852d18326 chore: clean up inference examples	7 months ago
AlpinDale	e1c4cf1d50 chore: organize chat templates	7 months ago
AlpinDale	c0c336aaa3 refactor: registry for processing model inputs; quick_gelu; clip model support	7 months ago
AlpinDale	426a13ab73 fix: pass multi_modal_kwargs to CPU model runner	7 months ago
AlpinDale	bb4da84623 fix: make sure multi modal kwargs can broadcast properly with ring buffer	7 months ago
AlpinDale	d2461161ec chore: optimize KV cache swapping for TPU	7 months ago
AlpinDale	fad45609b8 chore: remove logical token blocks (turns out they are not needed)	7 months ago
AlpinDale	b3643a7bd7 fix: min_tokens for when there are multiple eos tokens	7 months ago
AlpinDale	51cfadeb29 fix: `MLPSpeculator` handling of `num_speculative_tokens`	7 months ago
AlpinDale	c5d8028668 fix: no need to redefine supports_vision and supports_lora in model class	7 months ago
AlpinDale	b81966c0da fix: missed phi3v	7 months ago
AlpinDale	56e0b8223c chore: add base class for LoRA-supported models	7 months ago
AlpinDale	bc5ac9584a fix: make tensor_dict flattening/unflattening more generic	7 months ago
AlpinDale	dead030abf fix: cuda graph with MLPSpeculator	7 months ago
AlpinDale	271a680026 feat: inference support for PowerPC ISA	7 months ago
AlpinDale	8b626e4032 fix: cpu kv cache allocation for TPU	7 months ago
AlpinDale	fcd58614f4 feat: support parallel sampling and swapping in TPU	7 months ago
AlpinDale	5b464d36ea feat: bias epilogue support for cutlass kernels	7 months ago
AlpinDale	b16173b41e chore: add minimum concurrency for XPU	7 months ago
AlpinDale	af1286f9fa fix: kv cache size calculation on TPUs	7 months ago
AlpinDale	ecd4460d55 fix: support 2D inputs for embeddings	7 months ago
AlpinDale	66be475aae fix: shm broadcast when the queue size is full	7 months ago
