quantization | c3a221eb02 | feat: GGUF, QuIP#, and Marlin support (#228) | 1 year ago
triton_kernel | 8fa608aeb7 | feat: replace Ray with NCCL for control plane comms (#221) | 1 year ago
__init__.py | 07aa2a492f | upstream: add option to specify tokenizer | 1 year ago
activation.py | 8fa608aeb7 | feat: replace Ray with NCCL for control plane comms (#221) | 1 year ago
attention.py | 31c95011a6 | feat: FP8 E5M2 KV Cache (#226) | 1 year ago
layernorm.py | 8fa608aeb7 | feat: replace Ray with NCCL for control plane comms (#221) | 1 year ago
linear.py | c3a221eb02 | feat: GGUF, QuIP#, and Marlin support (#228) | 1 year ago
rejection.py | 95bdd35ec9 | feat: rejection sampler (#197) | 1 year ago
rotary_embedding.py | 8fa608aeb7 | feat: replace Ray with NCCL for control plane comms (#221) | 1 year ago
sampler.py | c3a221eb02 | feat: GGUF, QuIP#, and Marlin support (#228) | 1 year ago
vocab_parallel_embedding.py | c3a221eb02 | feat: GGUF, QuIP#, and Marlin support (#228) | 1 year ago