| Name | Commit | Message | Updated |
|---|---|---|---|
| adapter_commons | 99680b2d23 | feat: soft prompts (#589) | 6 months ago |
| attention | 2105e4fd6b | feat: correctly invoke prefill & decode kernels for cross-attention | 6 months ago |
| common | 16dff9babc | chore: enable bonus token in spec decoding for KV cache based models | 6 months ago |
| distributed | dba22e4f83 | fix: add zeromq fallback for broadcasting large objects (e.g. vlm images) | 6 months ago |
| endpoints | a3b56353fa | fix: another one missed | 6 months ago |
| engine | c0c2b1ac20 | fix: get_and_reset only when scheduler outputs are not empty | 6 months ago |
| executor | 4501ae5f15 | fix: neuron executor for adapters | 6 months ago |
| inputs | 4f7d212b70 | feat: remove vision language config | 6 months ago |
| kv_quant | e42a78381a | feat: switch from pylint to ruff (#322) | 1 year ago |
| lora | 99680b2d23 | feat: soft prompts (#589) | 6 months ago |
| modeling | db73f03cdc | fix: use ParallelLMHead for MLPSpeculator | 6 months ago |
| multimodal | c11a8bdaad | fix: calculate max number of multi-modal tokens automatically | 6 months ago |
| platforms | 1a40bf438b | fix: incorrect gpu capability when used mixed gpus | 6 months ago |
| processing | 99680b2d23 | feat: soft prompts (#589) | 6 months ago |
| prompt_adapter | 99680b2d23 | feat: soft prompts (#589) | 6 months ago |
| quantization | 058e629f8e | chore: refactor marlin python utils | 6 months ago |
| spec_decode | 16dff9babc | chore: enable bonus token in spec decoding for KV cache based models | 6 months ago |
| task_handler | d9f4c36edd | feat: Medusa speculative decoding support (#590) | 6 months ago |
| transformers_utils | d9f4c36edd | feat: Medusa speculative decoding support (#590) | 6 months ago |
| __init__.py | a07fc83bc8 | chore: proper util for aphrodite version | 7 months ago |
| _custom_ops.py | ad24e74a99 | feat: FP8 weight-only quantization support for Ampere GPUs | 6 months ago |
| _ipex_ops.py | 6a57861fca | feat: initial XPU support via intel_extension_for_pytorch (#571) | 7 months ago |
| py.typed | 1c988a48b2 | fix logging and add py.typed | 1 year ago |
| version.py | 7e54c3916d | chore: factor out epilogues from cutlass kernels | 7 months ago |