.. |
attention | ac79d115b3 add guards for prefix caching, fp8, chunked, etc | 5 months ago |
common | ac79d115b3 add guards for prefix caching, fp8, chunked, etc | 5 months ago |
distributed | 5b0c11d190 support pipeline parallel pynccl groups | 5 months ago |
endpoints | 696f2cd59c add phi3_small support with blocksparse attention | 5 months ago |
engine | ac79d115b3 add guards for prefix caching, fp8, chunked, etc | 5 months ago |
executor | 5b0c11d190 support pipeline parallel pynccl groups | 5 months ago |
kv_quant | e42a78381a feat: switch from pylint to ruff (#322) | 10 months ago |
lora | 5b0c11d190 support pipeline parallel pynccl groups | 5 months ago |
modeling | 9bbc75d2e3 wip | 5 months ago |
processing | 0d15aa3ab3 fix prefix caching for block manager v2 | 5 months ago |
quantization | 5884e0b904 add bitnetforcausallm support | 5 months ago |
spec_decode | 344ddaac5a properly disable speculative decoding | 5 months ago |
task_handler | 5b0c11d190 support pipeline parallel pynccl groups | 5 months ago |
transformers_utils | 5884e0b904 add bitnetforcausallm support | 5 months ago |
__init__.py | be8154a8a0 feat: proper embeddings API with e5-mistral-7b support | 5 months ago |
py.typed | 1c988a48b2 fix logging and add py.typed | 1 year ago |