| Name | Commit | Message | Last updated |
|---|---|---|---|
| fused_moe/ | 36660b55c2 | chore: mixtral fp8 w/ static scales (#542) | 5 months ago |
| ops/ | fca911ee0a | vLLM Upstream Sync (#526) | 6 months ago |
| __init__.py | 07aa2a492f | upstream: add option to specify tokenizer | 1 year ago |
| activation.py | fca911ee0a | vLLM Upstream Sync (#526) | 6 months ago |
| layernorm.py | e31c6f0b45 | feat: refactor modeling logic and support more models (#274) | 10 months ago |
| linear.py | b178ae4b4a | chore: generalize linear_method to be quant_method (#540) | 5 months ago |
| logits_processor.py | aed64884c6 | feat: prompt logprobs with chunked prefill (#539) | 5 months ago |
| rejection.py | 9d81716bfd | [v0.5.3] Release Candidate (#388) | 8 months ago |
| rotary_embedding.py | 3ab36e6b2d | feat: extended RoPE for Llama 3.1 (#543) | 5 months ago |
| sampler.py | c2be1b9f29 | formatting | 5 months ago |
| vocab_parallel_embedding.py | fca911ee0a | vLLM Upstream Sync (#526) | 6 months ago |