Name | Commit | Message | Last updated
attention | 75f97bc25d | bump flash-attn to remove unnecessary copies in the backend | 7 months ago
common | 76d6f49bbb | fix: modelscope downloads | 7 months ago
distributed | b2fd915c35 | improve p2p access check | 7 months ago
endpoints | 1d7f5c45b0 | feat: add stream_options for chat completions | 7 months ago
engine | d7ebffe2f0 | chore: re-add the graceful engine shutdown | 7 months ago
executor | 17eb1b7eb9 | chore: remove ray health check | 7 months ago
kv_quant | e42a78381a | feat: switch from pylint to ruff (#322) | 1 year ago
lora | c975bba905 | fix: sharded state loader with lora | 7 months ago
modeling | c975bba905 | fix: sharded state loader with lora | 7 months ago
multimodal | f2e94e2184 | chore: minor llava cleanups in preparation for llava-next | 7 months ago
processing | 3f92035bf1 | fix: add `ignored_seq_groups` in `_schedule_chunked_prefill` | 7 months ago
quantization | 40bc98b363 | chore: use cutlass kernels for fp8 if supported | 7 months ago
spec_decode | ec5b99d075 | fix: use named args | 7 months ago
task_handler | c975bba905 | fix: sharded state loader with lora | 7 months ago
transformers_utils | 76d6f49bbb | fix: modelscope downloads | 7 months ago
__init__.py | be8154a8a0 | feat: proper embeddings API with e5-mistral-7b support | 7 months ago
py.typed | 1c988a48b2 | fix logging and add py.typed | 1 year ago