..
guided_decoding | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
layers | 4434c4db84 | chore: refactor llama3 rope (#748) | 3 months ago
model_loader | 89a2c6dee1 | chore: refactor `MultiModalConfig` initialization and profiling (#745) | 3 months ago
models | 89a2c6dee1 | chore: refactor `MultiModalConfig` initialization and profiling (#745) | 3 months ago
__init__.py | 7df7b8ca53 | optimization: reduce end-to-end overhead from python obj allocation (#666) | 4 months ago
_custom_op.py | 5d37ec1016 | suppress tpu import warning (#696) | 4 months ago
parameter.py | 4f6020cc86 | chore: migrate gptq_marlin to AphroditeParameters (#699) | 4 months ago
pooling_metadata.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
sampling_metadata.py | 2da6a3ec2b | feat: option to apply temperature scaling last (#670) | 4 months ago
utils.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago