david/aphrodite-engine

Author	SHA1 Message	Date
AlpinDale	7253e9052d feat: integrate typical acceptance sampling for spec decoding	5 months ago
AlpinDale	b8a19ba27f chore: extend aphrodite metrics logging api	5 months ago
AlpinDale	bbde979ecd DeepSeek-V2 (#579)	5 months ago
AlpinDale	b8650ec51d fix: better error message for MLPSpeculator	5 months ago
AlpinDale	0886c361f4 feat: OpenVINO CPU backend (#576)	5 months ago
AlpinDale	c0c336aaa3 refactor: registry for processing model inputs; quick_gelu; clip model support	5 months ago
AlpinDale	51cfadeb29 fix: `MLPSpeculator` handling of `num_speculative_tokens`	5 months ago
AlpinDale	2c321ce1f2 chore: upgrade to rocm 6.1, update docker	5 months ago
AlpinDale	80ac1cdc8f fix: add args for the draft tp	5 months ago
AlpinDale	af43576da0 feat: add MLPSpeculator speculative decoding support (#572)	5 months ago
AlpinDale	0613d91551 fix: kv head calculation with MPT GQA	5 months ago
AlpinDale	6a57861fca feat: initial XPU support via intel_extension_for_pytorch (#571)	5 months ago
AlpinDale	e4407bbcb7 fix: do not start a ray cluster when not using ray	5 months ago
AlpinDale	ee174ea4fd fix: guard for lora + chunked prefill	5 months ago
AlpinDale	a89c9a0e92 fix: device ordinal issues with world_size and stuff	5 months ago
AlpinDale	06ed127441 fix: do not raise optimization warning for fp8 quant	5 months ago
AlpinDale	fe21123a1c feat: TPU support (#570)	5 months ago
AlpinDale	fa58ba87a3 fix: only set executor backend to mp if not multi-node	5 months ago
AlpinDale	bba89fc6d3 chore: make the automatic rope scaling behave properly with rope_scaling arg, add rope theta	5 months ago
AlpinDale	517676249c chore: update the compressed-tensors config	5 months ago
AlpinDale	76d6f49bbb fix: modelscope downloads	5 months ago
AlpinDale	f2e94e2184 chore: minor llava cleanups in preparation for llava-next	5 months ago
AlpinDale	237fa59aea feat: support CPU/GPU swapping in BlockManagerV2	5 months ago
AlpinDale	8d77c69cbd feat: support image processor and add llava example	5 months ago
AlpinDale	690110a051 feat: bitsandbytes quantization	5 months ago
AlpinDale	0307da9e15 refactor: bitsandbytes -> autoquant	5 months ago
AlpinDale	072aec1062 automatically detect sparseml models	5 months ago
AlpinDale	ac79d115b3 add guards for prefix caching, fp8, chunked, etc	5 months ago
AlpinDale	656459fd84 make fp8_e4m3 work on nvidia	6 months ago
AlpinDale	60e74e92fd add rope_scaling arg	6 months ago

Commit History