AlpinDale 1efd0f89b7 feat: support FP8 for DeepSeekV2 MoE 6 months ago
..
fused_moe 1efd0f89b7 feat: support FP8 for DeepSeekV2 MoE 6 months ago
mamba 5be90c3859 Mamba infrastructure support (#586) 6 months ago
ops fca911ee0a vLLM Upstream Sync (#526) 8 months ago
__init__.py 07aa2a492f upstream: add option to specify tokenizer 1 year ago
activation.py c0c336aaa3 refactor: registry for processing model inputs; quick_gelu; clip model support 7 months ago
layernorm.py 5761ef8c35 feat: gemma-2 support 6 months ago
linear.py d2f38f6f81 chore: remove separate bias add 6 months ago
logits_processor.py 5761ef8c35 feat: gemma-2 support 6 months ago
pooler.py be8154a8a0 feat: proper embeddings API with e5-mistral-7b support 7 months ago
rejection_sampler.py 7253e9052d feat: integrate typical acceptance sampling for spec decoding 7 months ago
rotary_embedding.py 5761ef8c35 feat: gemma-2 support 6 months ago
sampler.py be8154a8a0 feat: proper embeddings API with e5-mistral-7b support 7 months ago
spec_decode_base_sampler.py 7253e9052d feat: integrate typical acceptance sampling for spec decoding 7 months ago
typical_acceptance_sampler.py 7253e9052d feat: integrate typical acceptance sampling for spec decoding 7 months ago
vocab_parallel_embedding.py 0f4a9ee77b quantized lm_head (#582) 6 months ago