AlpinDale
|
dcebb8487d
video modality support
|
3 weeks ago |
AlpinDale
|
22a4cd4595
core: fix spec decode metrics and envs circular import (#889)
|
3 weeks ago |
AlpinDale
|
abfd4465ca
feat: add support for chunked prefill + prefix caching (#871)
|
1 month ago |
AlpinDale
|
1405051912
attention: add `AttentionState` abstraction (#863)
|
1 month ago |
AlpinDale
|
0a369f9171
feat: support chunked prefill with LoRA (#823)
|
1 month ago |
AlpinDale
|
0f1af04cf5
frontend: minor logging improvements (#787)
|
2 months ago |
AlpinDale
|
89a2c6dee1
chore: refactor `MultiModalConfig` initialization and profiling (#745)
|
3 months ago |
AlpinDale
|
0b8b407b6d
feat: support profiling with multiple multi-modal inputs per prompt (#712)
|
4 months ago |
AlpinDale
|
3693028340
feat: support for Audio modality (#698)
|
4 months ago |
AlpinDale
|
c2bb886b2e
fix: reinit procedure in `ModelInputForGPUBuilder` (#675)
|
4 months ago |
AlpinDale
|
bf88c8567e
feat: mamba model support (#674)
|
4 months ago |
AlpinDale
|
8583aefed7
chore: mamba cache single buffer (#673)
|
4 months ago |
AlpinDale
|
7df7b8ca53
optimization: reduce end-to-end overhead from python obj allocation (#666)
|
4 months ago |
AlpinDale
|
f1d0b77c92
[0.6.0] Release Candidate (#481)
|
4 months ago |
AlpinDale
|
9d81716bfd
[v0.5.3] Release Candidate (#388)
|
8 months ago |
AlpinDale
|
78d66f16d1
Chunked Prefill Part 1 (#384)
|
9 months ago |
AlpinDale
|
9181fa0396
feat: Triton kernels for sampling (#383)
|
9 months ago |
AlpinDale
|
4b99ac15b7
fix: do not deepcopy metadata
|
9 months ago |
AlpinDale
|
17b034613d
chore: make metadata a dataclass (#377)
|
9 months ago |
AlpinDale
|
f8dfac6372
chore: attention refactor and upstream sync apr01 (#365)
|
9 months ago |
50h100a
|
b9e0ae87c5
fix fine-grained seeding.
|
9 months ago |
AlpinDale
|
e42a78381a
feat: switch from pylint to ruff (#322)
|
10 months ago |
sgsdxzy
|
50c0875c32
chore: log total memory usage (#316)
|
10 months ago |
AlpinDale
|
c2d77b1822
chore: logging refactor (#302)
|
10 months ago |
AlpinDale
|
9810daa699
feat: INT8 KV Cache (#298)
|
10 months ago |
AlpinDale
|
ac82b67f75
feat: naive context shift and various QoL changes (#289)
|
10 months ago |
AlpinDale
|
4d04ade9ef
feat: fine-grained seeds (#279)
|
10 months ago |
AlpinDale
|
697c06c4f5
fix: LoRA support for mixtral (#276)
|
10 months ago |
AlpinDale
|
4b80b42362
fix: memory leaks due to nccl cuda graphs (#275)
|
10 months ago |
AlpinDale
|
ea0f57b233
feat: allow further support for non-cuda devices (#247)
|
11 months ago |