.. |
guided_decoding | 313e198557 | api: implement OpenAI-compatible tools API for Hermes/Mistral models (#993) | 1 month ago
layers | 172dee2573 | (2/N) Triton Backend: integrate Triton activation kernels (#1126) | 3 days ago
model_loader | 349a612338 | chore: bump bitsandbytes version to latest; enable cuda graphs for 4bit bnb (#1123) | 5 days ago
models | fa84f8102e | kernels: split marlin kernels for faster compile, fix MoE, temporarily remove HQQ (#1119) | 5 days ago
__init__.py | 22a4cd4595 | core: fix spec decode metrics and envs circular import (#889) | 1 month ago
_custom_op.py | 172dee2573 | (2/N) Triton Backend: integrate Triton activation kernels (#1126) | 3 days ago
parameter.py | 83af2524f3 | quants: add GPTQ and FBGEMM to AphroditeParameters (#987) | 1 month ago
pooling_metadata.py | f1d0b77c92 | [0.6.0] Release Candidate (#481) | 5 months ago
sampling_metadata.py | f20f5c3491 | samplers: improved DRY performance (#1108) | 2 weeks ago
utils.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 9 months ago