attention | 66b7bc4415 | sliding window in prefix kernel | 7 months ago
common | 42998e423c | better quant verification | 7 months ago
distributed | 096d9eb6c5 | enhance nvlink detection | 7 months ago
endpoints | fb7825df8f | squash logprobs | 7 months ago
engine | 42998e423c | better quant verification | 7 months ago
executor | f894f7b176 | Revert "reduce dedupe by wrapping in general worker class" | 7 months ago
kv_quant | e42a78381a | feat: switch from pylint to ruff (#322) | 9 months ago
lora | 8be299e78b | fix: lora load check | 7 months ago
modeling | 85a865cc00 | feat: fp8 quant | 7 months ago
processing | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago
quantization | c20073824a | cleanup | 7 months ago
spec_decode | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago
task_handler | f894f7b176 | Revert "reduce dedupe by wrapping in general worker class" | 7 months ago
transformers_utils | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago
__init__.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago
py.typed | 1c988a48b2 | fix logging and add py.typed | 1 year ago