.. |
output_processor | d4e78a428b | fix: crash when cancelling a request with multi-step (#977) | 2 weeks ago |
__init__.py | 04b53d2db5 | chore: add initializer files | 1 year ago |
aphrodite_engine.py | b3f6eeb1d2 | vlm: increase the default `max_num_batched_tokens` for multimodal models (#973) | 2 weeks ago |
args_tools.py | 510ae5b949 | core: fix chunked prefill not being enabled by default for long contexts (#974) | 2 weeks ago |
async_aphrodite.py | 0dfa6b60ec | core: support logprobs with multi-step scheduling (#963) | 2 weeks ago |
async_timeout.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago |
metrics.py | 3d83e64f8e | feat: add metrics for prefix cache hit rate (#829) | 1 month ago |
metrics_types.py | 3d83e64f8e | feat: add metrics for prefix cache hit rate (#829) | 1 month ago |
protocol.py | 0dfa6b60ec | core: support logprobs with multi-step scheduling (#963) | 2 weeks ago |