Author | Commit | Message | Date
AlpinDale | ac79d115b3 | add guards for prefix caching, fp8, chunked, etc | 7 months ago
AlpinDale | f6250c5516 | move dockerfiles to root; fix cpu build | 7 months ago
AlpinDale | 4e1ae004da | make mp the default distributed backend | 7 months ago
AlpinDale | 656459fd84 | make fp8_e4m3 work on nvidia | 7 months ago
AlpinDale | 60e74e92fd | add rope_scaling arg | 7 months ago
AlpinDale | 9e73559eba | make use of batched rotary embedding kernels to support long context lora | 7 months ago
AlpinDale | 7bcff4ac03 | implement sharded state dict | 7 months ago
AlpinDale | 13e5ffd456 | fix distributed_executor_backend in args | 7 months ago
AlpinDale | a94de94c44 | refactor: combine the prefill and decode into a single API (#553) | 7 months ago
AlpinDale | c6a501f682 | add multiprocessing executor; make ray optional | 7 months ago
AlpinDale | 0cea453d36 | automatically detect tensorized models | 7 months ago
AlpinDale | be8154a8a0 | feat: proper embeddings API with e5-mistral-7b support | 7 months ago
AlpinDale | 4acf34417a | feat: add DeepSpeedFP quantization for all models | 7 months ago
AlpinDale | 197a6d2c16 | auto disable speculative decoding by the running queue size | 7 months ago
AlpinDale | 21ce19b3ea | blocks_to_copy dict -> torch.Tensor | 7 months ago
Brian Dashore | 5533ab845e | feat: add uvloop (#550) | 7 months ago
AlpinDale | 35ae01d7ba | refactor: attention metadata term | 7 months ago
AlpinDale | 723c6acb84 | re-add ngram speculative decoding | 7 months ago
AlpinDale | e87c32bed3 | feat: full tensor parallel for LoRA layers (#545) | 7 months ago
AlpinDale | 46159b107a | formatting: pt1 | 8 months ago
AlpinDale | fca911ee0a | vLLM Upstream Sync (#526) | 8 months ago
AlpinDale | 42998e423c | better quant verification | 9 months ago
AlpinDale | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 10 months ago
AlpinDale | 78d66f16d1 | Chunked Prefill Part 1 (#384) | 11 months ago
AlpinDale | feb5840f2a | feat: async tokenization (#374) | 11 months ago
AlpinDale | f8dfac6372 | chore: attention refactor and upstream sync apr01 (#365) | 11 months ago
AlpinDale | c41462cfcd | feat: exllamav2 quantization (#305) | 1 year ago
AlpinDale | 9810daa699 | feat: INT8 KV Cache (#298) | 1 year ago
AlpinDale | e0c35bb353 | feat: bitsandbytes and `--load-in{4,8}bit` support (#294) | 1 year ago
AlpinDale | 705821a7fe | feat: AQLM quantization support (#293) | 1 year ago