david/aphrodite-engine

Author	SHA1 Message	Date
AlpinDale	13e5ffd456 fix distributed_executor_backend in args	6 months ago
AlpinDale	a94de94c44 refactor: combine the prefill and decode into a single API (#553)	6 months ago
AlpinDale	c6a501f682 add multiprocessing executor; make ray optional	6 months ago
AlpinDale	0cea453d36 automatically detect tensorized models	6 months ago
AlpinDale	be8154a8a0 feat: proper embeddings API with e5-mistral-7b support	6 months ago
AlpinDale	4acf34417a feat: add DeepSpeedFP quantization for all models	6 months ago
AlpinDale	197a6d2c16 auto disable speculative decoding by the running queue size	6 months ago
AlpinDale	21ce19b3ea blocks_to_copy dict -> torch.Tensor	6 months ago
Brian Dashore	5533ab845e feat: add uvloop (#550)	6 months ago
AlpinDale	35ae01d7ba refactor: attention metadata term	6 months ago
AlpinDale	723c6acb84 re-add ngram speculative decoding	6 months ago
AlpinDale	e87c32bed3 feat: full tensor parallel for LoRA layers (#545)	6 months ago
AlpinDale	46159b107a formatting: pt1	7 months ago
AlpinDale	fca911ee0a vLLM Upstream Sync (#526)	7 months ago
AlpinDale	42998e423c better quant verification	8 months ago
AlpinDale	9d81716bfd [v0.5.3] Release Candidate (#388)	8 months ago
AlpinDale	78d66f16d1 Chunked Prefill Part 1 (#384)	10 months ago
AlpinDale	feb5840f2a feat: async tokenization (#374)	10 months ago
AlpinDale	f8dfac6372 chore: attention refactor and upstream sync apr01 (#365)	10 months ago
AlpinDale	c41462cfcd feat: exllamav2 quantization (#305)	10 months ago
AlpinDale	9810daa699 feat: INT8 KV Cache (#298)	11 months ago
AlpinDale	e0c35bb353 feat: bitsandbytes and `--load-in{4,8}bit` support (#294)	11 months ago
AlpinDale	705821a7fe feat: AQLM quantization support (#293)	11 months ago
AlpinDale	ac82b67f75 feat: naive context shift and various QoL changes (#289)	11 months ago
AlpinDale	72229a94da feat: better marlin kernels (#285)	11 months ago
AlpinDale	657aec0cbd refactor: OpenAI endpoint (#261)	11 months ago
AlpinDale	4d04ade9ef feat: fine-grained seeds (#279)	11 months ago
AlpinDale	ea0f57b233 feat: allow further support for non-cuda devices (#247)	1 year ago
AlpinDale	c3a221eb02 feat: GGUF, QuIP#, and Marlin support (#228)	1 year ago
AlpinDale	31c95011a6 feat: FP8 E5M2 KV Cache (#226)	1 year ago

Commit History
