| Name | Commit | Message | Last updated |
|---|---|---|---|
| marlin | 72229a94da | feat: better marlin kernels (#285) | 10 months ago |
| monitoring | 9ed45fec7c | fix: incorrect prometheus url | 11 months ago |
| alpaca_template.jinja | 2755a48d51 | merge dev branch into main (#153) | 1 year ago |
| aphrodite_engine_example.py | f32d57ed04 | add inference examples | 1 year ago |
| api_client.py | f32d57ed04 | add inference examples | 1 year ago |
| chatml_template.jinja | 2755a48d51 | merge dev branch into main (#153) | 1 year ago |
| gguf_to_torch.py | fe7844f2ef | feat: sharding and safetensors support for gguf conversion (#256) | 11 months ago |
| gradio_server.py | 551c4280cf | chore: change default port to 2242 | 1 year ago |
| offline_inference.py | f32d57ed04 | add inference examples | 1 year ago |
| perplexity.py | 2b5af25923 | add perplexity example | 10 months ago |
| prefix_cache_example.py | 8fa608aeb7 | feat: replace Ray with NCCL for control plane comms (#221) | 11 months ago |
| pygchat_template.jinja | 80e8a14949 | feat: add pygchat Jinja template (#218) | 11 months ago |
| slora_inference.py | 8635901c76 | fix: s-lora vocab embeddings | 11 months ago |