david/aphrodite-engine

Author	SHA1 Message	Date
AlpinDale	c2bb886b2e fix: reinit procedure in `ModelInputForGPUBuilder` (#675)	6 months ago
AlpinDale	bf88c8567e feat: mamba model support (#674)	6 months ago
AlpinDale	8583aefed7 chore: mamba cache single buffer (#673)	6 months ago
AlpinDale	19ad952dd4 chore: better stream termination in async engine (#672)	6 months ago
AlpinDale	1394008421 chore: decouple `should_modify_greedy_probs_inplace (#671)	6 months ago
AlpinDale	2da6a3ec2b feat: option to apply temperature scaling last (#670)	6 months ago
AlpinDale	e3a53712f2 fix: mlpspeculator with padded vocab (#669)	6 months ago
AlpinDale	e200775863 feat: enable using fp8 kv and prefix caching with chunked prefill (#668)	6 months ago
AlpinDale	ef40c05cd3 fix: minor adjustments to scheduler and block manager (#667)	6 months ago
AlpinDale	7df7b8ca53 optimization: reduce end-to-end overhead from python obj allocation (#666)	6 months ago
AlpinDale	ea78357d70 fix: deps with TPU dockerfile (#665)	6 months ago
AlpinDale	62111fab17 feat: allow serving encoder-decoder models in the API server (#664)	6 months ago
AlpinDale	3f49a55f82 feat: add INT8 W8A16 quant for TPU (#663)	6 months ago
AlpinDale	5dd0145414 chore: update the env.py script and the bug report template (#662)	6 months ago
AlpinDale	1927ce2be4 fix: `get_num_blocks_touched` logic (#661)	6 months ago
AlpinDale	ed9a6f97c1 fix: kill api server when pinging dead engine (#660)	6 months ago
AlpinDale	6d54f7687d fix: lora with pipeline parallel (#659)	6 months ago
AlpinDale	3405782f24 fix: max_num_batched_tokens should not be limited for lora (#658)	6 months ago
AlpinDale	67ee885293 fix: flashinfer outputs (#657)	6 months ago
AlpinDale	0e5bb11503 fix: make `merge_async_iterators.is_cancelled()` optional (#656)	6 months ago
AlpinDale	3170c0d4c6 fix: GPTQ/AWQ on Colab (#655)	6 months ago
AlpinDale	83bcb9119a fix: multiprocessing timeout (#654)	6 months ago
AlpinDale	1e119cbeb6 fix: input processor in internvl2 (#653)	6 months ago
AlpinDale	a2344d3617 fix: move zeromq rpc frontend to IPC instead of TCP (#652)	6 months ago
AlpinDale	f1e1d0bd3d feat: introduce `BaseAphroditeParameter` (#646)	6 months ago
AlpinDale	47ac074937 fix: RSLoRA support (#647)	6 months ago
50h100a	b96ba9930e Merge pull request #644 from 50h100a/quadfix	6 months ago
AlpinDale	59264d32e9 fix: hardcoded float16 in embedding mode check (#645)	6 months ago
50h100a	cbdf2d986f quadratic sampling: separate diff from logits to avoid NaNs.	6 months ago
AlpinDale	31f82da8bd chore: deduplicate nvlink check to cuda platform (#643)	6 months ago


Commit history