guided_decoding | 0256ed236b | feat: windows support (#790) | 2 months ago
layers | ef99a567b6 | fix: temp_last warning being repeated for every output token (#869) | 1 month ago
model_loader | e182d00256 | feat: AWQ quantization for InternVL (#867) | 1 month ago
models | e182d00256 | feat: AWQ quantization for InternVL (#867) | 1 month ago
__init__.py | 7df7b8ca53 | optimization: reduce end-to-end overhead from python obj allocation (#666) | 4 months ago
_custom_op.py | 5d37ec1016 | suppress tpu import warning (#696) | 4 months ago
parameter.py | 93bc863591 | feat: Machete Kernels for Hopper GPUs (#842) | 1 month ago
pooling_metadata.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 4 months ago
sampling_metadata.py | 2150bb5019 | sampler: add range parameter for DRY (#855) | 1 month ago
utils.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago