Author | Commit | Message | Date
AlpinDale | 9bbc75d2e3 | wip | 5 months ago
AlpinDale | 60af35bc34 | wip | 5 months ago
AlpinDale | 74cb1aad4e | wip | 5 months ago
AlpinDale | 5d965b34a7 | bitnet -> bitblas in reqs | 5 months ago
AlpinDale | 5884e0b904 | add bitnetforcausallm support | 5 months ago
AlpinDale | 2649f3f14e | aqlm works on pascal | 5 months ago
AlpinDale | ac79d115b3 | add guards for prefix caching, fp8, chunked, etc | 5 months ago
AlpinDale | 344ddaac5a | properly disable speculative decoding | 5 months ago
AlpinDale | 696f2cd59c | add phi3_small support with blocksparse attention | 5 months ago
AlpinDale | 0d15aa3ab3 | fix prefix caching for block manager v2 | 5 months ago
AlpinDale | 7d0884de9a | fix mistral v0.3 weight loading | 5 months ago
AlpinDale | e8b7f53321 | allow prompt token IDs in the logits processor api | 5 months ago
AlpinDale | f4ea11b982 | feat: initial support for activation quantization | 5 months ago
Drake | e1a142c179 | Fix OpenAI chat completions compatibility (#559) | 5 months ago
AlpinDale | 5b0c11d190 | support pipeline parallel pynccl groups | 5 months ago
AlpinDale | f6250c5516 | move dockerfiles to root; fix cpu build | 5 months ago
AlpinDale | d8667fcb98 | improve gptq_marlin_24 prefill performance | 5 months ago
AlpinDale | eb2c5c77df | feat: enforce the max possible seqlen | 5 months ago
AlpinDale | 19a959a03e | prioritize user selection for attention | 5 months ago
AlpinDale | c1ed789835 | fix: typo in llama.py | 5 months ago
AlpinDale | 4e1ae004da | make mp the default distributed backend | 5 months ago
AlpinDale | de62ceb18c | refactor: eliminate parallel worker per-step task scheduling overhead | 5 months ago
AlpinDale | 656459fd84 | make fp8_e4m3 work on nvidia | 5 months ago
AlpinDale | 6e626b902c | fix cutlass w8a8 kernels for cuda stream | 5 months ago
AlpinDale | 3bdeb3e116 | fix: clang formatting for all kernels (#558) | 5 months ago
AlpinDale | 04d22bf1a9 | add clang-format | 5 months ago
AlpinDale | 60e74e92fd | add rope_scaling arg | 5 months ago
AlpinDale | b8b63eb5ca | fix head_size check for flash attention backend | 5 months ago
AlpinDale | 8077af0b2f | add lora support for phi | 5 months ago
AlpinDale | 295cfb2f39 | add rope scaling for qwen2 | 5 months ago