david/aphrodite-engine

Author	SHA1 Message	Date
AlpinDale	9bdf8d5bfa mamba: enable continuous batching for mamba kernels (#1055)	1 week ago
AlpinDale	a985143768 core: add cuda graph support for encoder-decoder models (#1051)	1 week ago
AlpinDale	4593a3b306 chore: remove dead code from triton sampling kernels (#1049)	1 week ago
AlpinDale	638c08d9dc fix: clean shutdown issues (#1047)	1 week ago
AlpinDale	65a59bbb6b cpu: raise error if using encoder-decoder models (#1027)	1 week ago
AlpinDale	f1ea7711bd core: do not compile ScalarType for torch < 2.4.0 (#938)	2 weeks ago
AlpinDale	22a4cd4595 core: fix spec decode metrics and envs circular import (#889)	3 weeks ago
AlpinDale	901900854e chore: consolidate environment variables within one file (#882)	4 weeks ago
AlpinDale	9fc6473b18 server: log the process occupying our port (#866)	1 month ago
AlpinDale	0f1af04cf5 frontend: minor logging improvements (#787)	2 months ago
AlpinDale	0256ed236b feat: windows support (#790)	2 months ago
50h100a	371d57af82 filesize-driven progress bar for loading tensors	2 months ago
AlpinDale	0b8b407b6d feat: support profiling with multiple multi-modal inputs per prompt (#712)	3 months ago
AlpinDale	5d37ec1016 suppress tpu import warning (#696)	4 months ago
AlpinDale	4fe371b7fa fix: allow passing float for GiB arguments (#690)	4 months ago
AlpinDale	3f712cd287 feat: add progress bar for loading individual weight modules (#640)	4 months ago
AlpinDale	7df7b8ca53 optimization: reduce end-to-end overhead from python obj allocation (#666)	4 months ago
AlpinDale	62111fab17 feat: allow serving encoder-decoder models in the API server (#664)	4 months ago
AlpinDale	0e5bb11503 fix: make `merge_async_iterators.is_cancelled()` optional (#656)	4 months ago
AlpinDale	a2344d3617 fix: move zeromq rpc frontend to IPC instead of TCP (#652)	4 months ago
AlpinDale	31f82da8bd chore: deduplicate nvlink check to cuda platform (#643)	4 months ago
AlpinDale	77c4fbd5c9 fix: better async request cancellation (#641)	4 months ago
AlpinDale	308501daa5 fix: default api port and attention selector (#634)	4 months ago
AlpinDale	a0e446a17d feat: initial encoder-decoder support with BART model (#633)	4 months ago
AlpinDale	f1d0b77c92 [0.6.0] Release Candidate (#481)	4 months ago
AlpinDale	9d81716bfd [v0.5.3] Release Candidate (#388)	8 months ago
AlpinDale	e3252edd07 fix: remove event and stream, add typing (#382)	9 months ago
AlpinDale	33b3786175 fix: cache neuron checks (#379)	9 months ago
AlpinDale	f8dfac6372 chore: attention refactor and upstream sync apr01 (#365)	9 months ago
AlpinDale	e53842bd5d fix: cuda home detection for fp8 kv cache	9 months ago

