fused_moe | 4abbbdad78 | chore: make triton fully optional | 4 months ago
mamba | 2dfa4e47e6 | chore: set seed for dummy weights init | 4 months ago
ops | 4abbbdad78 | chore: make triton fully optional | 4 months ago
__init__.py | 07aa2a492f | upstream: add option to specify tokenizer | 1 year ago
activation.py | 6b1fdd07bd | chore: add isort and refactor formatting script and utils | 4 months ago
layernorm.py | 5761ef8c35 | feat: gemma-2 support | 4 months ago
linear.py | 0e6c400b13 | feat: re-add GGUF (#600) | 4 months ago
logits_processor.py | 4d4e767838 | ci: take one of fixing lint issues | 4 months ago
pooler.py | be8154a8a0 | feat: proper embeddings API with e5-mistral-7b support | 5 months ago
rejection_sampler.py | d8a51d05a7 | fix: seeded gens with pipeline parallel | 4 months ago
rotary_embedding.py | 18b45266bb | feat: add nemotron HF support (#606) | 4 months ago
sampler.py | 4abbbdad78 | chore: make triton fully optional | 4 months ago
spec_decode_base_sampler.py | d8a51d05a7 | fix: seeded gens with pipeline parallel | 4 months ago
typical_acceptance_sampler.py | 4d4e767838 | ci: take one of fixing lint issues | 4 months ago
vocab_parallel_embedding.py | 0e6c400b13 | feat: re-add GGUF (#600) | 4 months ago