attention
|
ca6b69966d
fix: explicitly end_forward() calls to flashinfer
|
7 months ago |
common
|
ae04f57ec1
feat: Pipeline Parallel support (#581)
|
6 months ago |
distributed
|
ae04f57ec1
feat: Pipeline Parallel support (#581)
|
6 months ago |
endpoints
|
63b735bc2a
chore: optimize v2 block manager to match the performance of v1
|
7 months ago |
engine
|
ae04f57ec1
feat: Pipeline Parallel support (#581)
|
6 months ago |
executor
|
ae04f57ec1
feat: Pipeline Parallel support (#581)
|
6 months ago |
inputs
|
3a0fdf7b9b
chore: remove `image_input_type` from VLM config
|
7 months ago |
kv_quant
|
e42a78381a
feat: switch from pylint to ruff (#322)
|
1 year ago |
lora
|
0f4a9ee77b
quantized lm_head (#582)
|
6 months ago |
modeling
|
0f4a9ee77b
quantized lm_head (#582)
|
6 months ago |
multimodal
|
dd378ea063
feat: MLPSpeculator with tensor parallel
|
7 months ago |
processing
|
ae04f57ec1
feat: Pipeline Parallel support (#581)
|
6 months ago |
quantization
|
0f4a9ee77b
quantized lm_head (#582)
|
6 months ago |
spec_decode
|
ae04f57ec1
feat: Pipeline Parallel support (#581)
|
6 months ago |
task_handler
|
ae04f57ec1
feat: Pipeline Parallel support (#581)
|
6 months ago |
transformers_utils
|
3a0fdf7b9b
chore: remove `image_input_type` from VLM config
|
7 months ago |
__init__.py
|
a07fc83bc8
chore: proper util for aphrodite version
|
7 months ago |
_custom_ops.py
|
c0c336aaa3
refactor: registry for processing model inputs; quick_gelu; clip model support
|
7 months ago |
_ipex_ops.py
|
6a57861fca
feat: initial XPU support via intel_extension_for_pytorch (#571)
|
7 months ago |
py.typed
|
1c988a48b2
fix logging and add py.typed
|
1 year ago |
version.py
|
7e54c3916d
chore: factor out epilogues from cutlass kernels
|
7 months ago |