AlpinDale | 8442b36171 | spec decoding: set the draft model ctxlen to target model | 1 month ago
AlpinDale | 55b7ce56c1 | cpu: fix `mm_limits` initialization (#873) | 1 month ago
AlpinDale | 5bd4473bb6 | async: avoid premature exit in the async generator (#872) | 1 month ago
AlpinDale | abfd4465ca | feat: add support for chunked prefill + prefix caching (#871) | 1 month ago
AlpinDale | ef99a567b6 | fix: temp_last warning being repeated for every output token (#869) | 1 month ago
Naomiusearch | 4f9fea4c4d | fix: ROCm build (#817) | 1 month ago
50h100a | 9b569279fd | Merge pull request #868 from PygmalionAI/dry_zoom | 1 month ago
50h100a | fc3c1cd5a5 | this is getting its own commit because lint failures like that are exactly why people stop using linters | 1 month ago
50h100a | 60a5d0fb80 | rewrite DRY to be a lot faster | 1 month ago
AlpinDale | e182d00256 | feat: AWQ quantization for InternVL (#867) | 1 month ago
AlpinDale | 9fc6473b18 | server: log the process occupying our port (#866) | 1 month ago
AlpinDale | db96c2daa3 | executor: pipe `worker_class_fn` arg in executor (#865) | 1 month ago
AlpinDale | 369600855a | xpu: disable punica kernels for XPU (#864) | 1 month ago
AlpinDale | 1405051912 | attention: add `AttentionState` abstraction (#863) | 1 month ago
AlpinDale | 82eabb6aa7 | build: add jinja2 to requirements file (#862) | 1 month ago
AlpinDale | 9094a8a2a3 | xpu: refactor XPU worker & executor (#861) | 1 month ago
AlpinDale | 8b8d2ce7e2 | ci: bump aphrodite version to 0.6.4.post1 (#859) | 1 month ago
AlpinDale | 3392b81bf9 | sampler: allow parsing sampler order using strings (#858) | 1 month ago
AlpinDale | 0035dc42ed | sampler: optimize DRY performance using z-algorithm (#856) | 1 month ago
AlpinDale | 2150bb5019 | sampler: add range parameter for DRY (#855) | 1 month ago
AlpinDale | 72c505ad84 | sampler: fix dry concurrency issue (#852) | 1 month ago
Selali | 14ac216498 | sampler: add output_tokens to DRY sampler (#849) | 1 month ago
Luke Harold Miles | d486d7ac01 | docs: add linux arm64/aarch64/GH200 installation tips (#851) | 1 month ago
AlpinDale | d2971a6831 | ci: bump version to 0.6.4 (#845) | 1 month ago
AlpinDale | 538471f76e | chore: bump mistral_common to 1.5.0 (#844) | 1 month ago
AlpinDale | 483c9e6e59 | fix: disable awq_marlin override for awq models (#843) | 1 month ago
AlpinDale | dfa34d1b24 | feat: add sampler_priorty (#837) | 1 month ago
AlpinDale | 93bc863591 | feat: Machete Kernels for Hopper GPUs (#842) | 1 month ago
AlpinDale | 563e8f7ac8 | fix: latency and serving benchmarks (#841) | 1 month ago
AlpinDale | 7c7ec12f36 | chore: refactor executor classes for easier inheritance (#840) | 1 month ago