Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| AlpinDale | b0f262eec1 | feat: FP8 quantization support for AMD ROCm (#729) | 4 months ago |
| AlpinDale | 28b6397188 | chore: quant config for speculative draft models (#719) | 4 months ago |
| AlpinDale | 577586309d | chore: multi-step args and sequence modifications (#713) | 4 months ago |
| AlpinDale | 0b8b407b6d | feat: support profiling with multiple multi-modal inputs per prompt (#712) | 4 months ago |
| AlpinDale | d5033e12fd | feat: implement mistral tokenizer mode (#711) | 4 months ago |
| AlpinDale | 5d37ec1016 | suppress tpu import warning (#696) | 4 months ago |
| AlpinDale | 4fe371b7fa | fix: allow passing float for GiB arguments (#690) | 4 months ago |
| AlpinDale | bf88c8567e | feat: mamba model support (#674) | 4 months ago |
| AlpinDale | e200775863 | feat: enable using fp8 kv and prefix caching with chunked prefill (#668) | 4 months ago |
| AlpinDale | 62111fab17 | feat: allow serving encoder-decoder models in the API server (#664) | 4 months ago |
| AlpinDale | 3f49a55f82 | feat: add INT8 W8A16 quant for TPU (#663) | 4 months ago |
| AlpinDale | 3405782f24 | fix: max_num_batched_tokens should not be limited for lora (#658) | 4 months ago |
| AlpinDale | a0e446a17d | feat: initial encoder-decoder support with BART model (#633) | 4 months ago |
| AlpinDale | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago |
| AlpinDale | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago |
| AlpinDale | 78d66f16d1 | Chunked Prefill Part 1 (#384) | 9 months ago |
| AlpinDale | feb5840f2a | feat: async tokenization (#374) | 9 months ago |
| AlpinDale | 29c241c115 | fix: explicitly disallow installation on non-linux platforms (#373) | 9 months ago |
| AlpinDale | 97a2b26c97 | fix: assertion error when use_sliding_window is present | 9 months ago |
| AlpinDale | 0f6d56b07f | feat: model executor refactor (#367) | 9 months ago |
| AlpinDale | f8dfac6372 | chore: attention refactor and upstream sync apr01 (#365) | 9 months ago |
| AlpinDale | c41462cfcd | feat: exllamav2 quantization (#305) | 10 months ago |
| AlpinDale | c2d77b1822 | chore: logging refactor (#302) | 10 months ago |
| AlpinDale | a98babfb74 | fix: bnb on Turing GPUs (#299) | 10 months ago |
| AlpinDale | 9810daa699 | feat: INT8 KV Cache (#298) | 10 months ago |
| AlpinDale | e0c35bb353 | feat: bitsandbytes and `--load-in{4,8}bit` support (#294) | 10 months ago |
| AlpinDale | 705821a7fe | feat: AQLM quantization support (#293) | 10 months ago |
| AlpinDale | a1d8ab9f3e | fix: lora on quantized models (barred gguf) (#292) | 10 months ago |
| AlpinDale | ac82b67f75 | feat: naive context shift and various QoL changes (#289) | 10 months ago |
| AlpinDale | 72229a94da | feat: better marlin kernels (#285) | 10 months ago |