__init__.py | 07aa2a492f | upstream: add option to specify tokenizer | 1 year ago
block.py | 8fa608aeb7 | feat: replace Ray with NCCL for control plane comms (#221) | 11 months ago
config.py | c3a221eb02 | feat: GGUF, QuIP#, and Marlin support (#228) | 11 months ago
grammar.py | 0adab894fe | feat: grammar support (#206) | 11 months ago
logger.py | 8834ecf9de | chore: clean up refactor endpoints (#98) | 1 year ago
logits_processor.py | 8fa608aeb7 | feat: replace Ray with NCCL for control plane comms (#221) | 11 months ago
outputs.py | c0aac15421 | feat: S-LoRA support (#222) | 11 months ago
prefix.py | c0aac15421 | feat: S-LoRA support (#222) | 11 months ago
sampling_params.py | 1c46fa31ad | feat: add quadratic sampling (#233) | 11 months ago
sequence.py | c0aac15421 | feat: S-LoRA support (#222) | 11 months ago
test_utils.py | 641bb0f6e9 | feat: add custom allreduce kernels (#224) | 11 months ago
utils.py | 31c95011a6 | feat: FP8 E5M2 KV Cache (#226) | 11 months ago