AlpinDale | 9aaeb5d349 | add speculative config and arg for later | 9 months ago
AlpinDale | df7ae8ce01 | fix spec_decode and block imports | 9 months ago
AlpinDale | d3351c75f1 | fix minor cuda version mismatch with runtime | 9 months ago
AlpinDale | a304f76d89 | feat: Intel CPU support (#403) | 9 months ago
AlpinDale | fa083286e3 | Speculative Decoding Part 4: Lookahead scheduling (#402) | 9 months ago
AlpinDale | aa1db50131 | simplify tokenizer.py | 9 months ago
AlpinDale | d68fad5a79 | feat: add optimized layernorm kernels (#398) | 9 months ago
AlpinDale | 95faf27d2b | fix build for 7.5 | 9 months ago
AlpinDale | ea26c91e52 | proper typing | 9 months ago
sgsdxzy | 47370d2ad5 | Fix cohere for command-r+ (#394) | 9 months ago
AlpinDale | 2c512a0824 | CMake build system (#395) | 9 months ago
AlpinDale | 3abc641d68 | directly use in forward pass | 9 months ago
sgsdxzy | 255c2f1d67 | small fixes (#393) | 9 months ago
AlpinDale | 04c38f2b91 | cache tokenizer len | 9 months ago
AlpinDale | c6fc4d2c90 | fix case when API request to top_k is 0 | 9 months ago
AlpinDale | 0f0ec6832b | rccl path for ROCm | 9 months ago
AlpinDale | a472276ef3 | fix codespell path | 9 months ago
AlpinDale | 98a565d5ec | do not use codespell on kernels | 9 months ago
AlpinDale | c3c374396b | logprobs fixes | 9 months ago
AlpinDale | d49187c231 | this is kinda dumb if you ask me | 9 months ago
AlpinDale | aa244761ed | formatting and typing | 9 months ago
AlpinDale | 41beab5dc1 | add exllamav2 tensor parallel, fused MoE for GPTQ/AWQ | 9 months ago
AlpinDale | 10e708726e | enable multi-node inference | 9 months ago
AlpinDale | 7533d4d458 | optional vision language config for neuron | 9 months ago
AlpinDale | f845a661dd | Chunked Prefill Part 2: data update | 9 months ago
AlpinDale | bd44122b8e | add qwen2moe support (needs transformers git) | 9 months ago
AlpinDale | eff5eb16c5 | ruff | 9 months ago
AlpinDale | 753f6dc51b | add v2 block manager | 9 months ago
AlpinDale | 211f040107 | formatting | 9 months ago
AlpinDale | aa23ca6ba9 | add dbrx support | 9 months ago