..
guided_decoding | 313e198557 | api: implement OpenAI-compatible tools API for Hermes/Mistral models (#993) | 1 month ago
layers | 172dee2573 | (2/N) Triton Backend: integrate Triton activation kernels (#1126) | 3 days ago
model_loader | 349a612338 | chore: bump bitsandbytes version to latest; enable cuda graphs for 4bit bnb (#1123) | 5 days ago
models | fa84f8102e | kernels: split marlin kernels for faster compile, fix MoE, temporarily remove HQQ (#1119) | 5 days ago
__init__.py | 22a4cd4595 | core: fix spec decode metrics and envs circular import (#889) | 1 month ago
_custom_op.py | 172dee2573 | (2/N) Triton Backend: integrate Triton activation kernels (#1126) | 3 days ago
parameter.py | 83af2524f3 | quants: add GPTQ and FBGEMM to AphroditeParameters (#987) | 1 month ago
pooling_metadata.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 5 months ago
sampling_metadata.py | f20f5c3491 | samplers: improved DRY performance (#1108) | 2 weeks ago
utils.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 9 months ago