.. |
fused_moe
|
1efd0f89b7
feat: support FP8 for DeepSeekV2 MoE
|
6 months ago |
mamba
|
5be90c3859
Mamba infrastructure support (#586)
|
6 months ago |
ops
|
fca911ee0a
vLLM Upstream Sync (#526)
|
8 months ago |
__init__.py
|
07aa2a492f
upstream: add option to specify tokenizer
|
1 year ago |
activation.py
|
c0c336aaa3
refactor: registry for processing model inputs; quick_gelu; clip model support
|
7 months ago |
layernorm.py
|
5761ef8c35
feat: gemma-2 support
|
6 months ago |
linear.py
|
d2f38f6f81
chore: remove separate bias add
|
6 months ago |
logits_processor.py
|
5761ef8c35
feat: gemma-2 support
|
6 months ago |
pooler.py
|
be8154a8a0
feat: proper embeddings API with e5-mistral-7b support
|
7 months ago |
rejection_sampler.py
|
7253e9052d
feat: integrate typical acceptance sampling for spec decoding
|
7 months ago |
rotary_embedding.py
|
5761ef8c35
feat: gemma-2 support
|
6 months ago |
sampler.py
|
be8154a8a0
feat: proper embeddings API with e5-mistral-7b support
|
7 months ago |
spec_decode_base_sampler.py
|
7253e9052d
feat: integrate typical acceptance sampling for spec decoding
|
7 months ago |
typical_acceptance_sampler.py
|
7253e9052d
feat: integrate typical acceptance sampling for spec decoding
|
7 months ago |
vocab_parallel_embedding.py
|
0f4a9ee77b
quantized lm_head (#582)
|
6 months ago |