.. |
attention
|
c8c6de64cd
fix: typo in pallas backend
|
7 months ago |
common
|
ee174ea4fd
fix: guard for lora + chunked prefill
|
7 months ago |
distributed
|
44331a4d00
chore: improve p2p cache generation
|
7 months ago |
endpoints
|
a07fc83bc8
chore: proper util for aphrodite version
|
7 months ago |
engine
|
c482c09a3a
fix: remove duplicated input processing in async engine
|
7 months ago |
executor
|
a89c9a0e92
fix: device ordinal issues with world_size and stuff
|
7 months ago |
kv_quant
|
e42a78381a
feat: switch from pylint to ruff (#322)
|
1 year ago |
lora
|
42d2ee0f43
chore: better error logging for unsupported lora weights
|
7 months ago |
modeling
|
d0cca80b8b
feat: support sharded tensorizer models
|
7 months ago |
multimodal
|
f2e94e2184
chore: minor llava cleanups in preparation for llava-next
|
7 months ago |
processing
|
f9a10145d1
fix: v2 block manager + prefix caching
|
7 months ago |
quantization
|
a33aaf3b42
chore: cleanup compressed tensors
|
7 months ago |
spec_decode
|
4d1e613804
chore: minor simplifications
|
7 months ago |
task_handler
|
34b41e0a87
chore: add coordinator to reduce code duplication in tp and pp
|
7 months ago |
transformers_utils
|
bba89fc6d3
chore: make the automatic rope scaling behave properly with rope_scaling arg, add rope theta
|
7 months ago |
__init__.py
|
a07fc83bc8
chore: proper util for aphrodite version
|
7 months ago |
_custom_ops.py
|
7e54c3916d
chore: factor out epilogues from cutlass kernels
|
7 months ago |
py.typed
|
1c988a48b2
fix logging and add py.typed
|
1 year ago |
version.py
|
7e54c3916d
chore: factor out epilogues from cutlass kernels
|
7 months ago |