AlpinDale c3a221eb02 feat: GGUF, QuIP#, and Marlin support (#228) 1 year ago
quantization c3a221eb02 feat: GGUF, QuIP#, and Marlin support (#228) 1 year ago
triton_kernel 8fa608aeb7 feat: replace Ray with NCCL for control plane comms (#221) 1 year ago
__init__.py 07aa2a492f upstream: add option to specify tokenizer 1 year ago
activation.py 8fa608aeb7 feat: replace Ray with NCCL for control plane comms (#221) 1 year ago
attention.py 31c95011a6 feat: FP8 E5M2 KV Cache (#226) 1 year ago
layernorm.py 8fa608aeb7 feat: replace Ray with NCCL for control plane comms (#221) 1 year ago
linear.py c3a221eb02 feat: GGUF, QuIP#, and Marlin support (#228) 1 year ago
rejection.py 95bdd35ec9 feat: rejection sampler (#197) 1 year ago
rotary_embedding.py 8fa608aeb7 feat: replace Ray with NCCL for control plane comms (#221) 1 year ago
sampler.py c3a221eb02 feat: GGUF, QuIP#, and Marlin support (#228) 1 year ago
vocab_parallel_embedding.py c3a221eb02 feat: GGUF, QuIP#, and Marlin support (#228) 1 year ago