AlpinDale
|
9810daa699
feat: INT8 KV Cache (#298)
|
10 months ago |
AlpinDale
|
13d850334e
fix: navi support (#283)
|
10 months ago |
AlpinDale
|
ea0f57b233
feat: allow further support for non-cuda devices (#247)
|
11 months ago |
AlpinDale
|
31c95011a6
feat: FP8 E5M2 KV Cache (#226)
|
11 months ago |
AlpinDale
|
8fa608aeb7
feat: replace Ray with NCCL for control plane comms (#221)
|
11 months ago |
AlpinDale
|
15a0454172
feat: FP8 KV Cache (#185)
|
1 year ago |
AlpinDale
|
7d91e9e0f2
feat: CUDA graphs (#172)
|
1 year ago |
AlpinDale
|
02f3ab3501
fix: replace head_mapping with num_kv_heads (#161)
|
1 year ago |
AlpinDale
|
653da510d1
chore: rewrite InputMetadata (#143)
|
1 year ago |
AlpinDale
|
1334a833a4
feat: AMD ROCm support (#95)
|
1 year ago |
AlpinDale
|
8b2bbbd98b
chore: attention rewrite + models (#135)
|
1 year ago |
AlpinDale
|
1aab8a7d6f
feat: speedup compilation times by 3x (#130)
|
1 year ago |
AlpinDale
|
9ec4e08ade
fix: cpu sync delay fix (#127)
|
1 year ago |
AlpinDale
|
7e8483c6d7
chore: rope refactors (#114)
|
1 year ago |
AlpinDale
|
5175605f8d
fix: yarn (#112)
|
1 year ago |
AlpinDale
|
f384f3ae60
fix: force v2 for ctxlen larger than 8192 (#100)
|
1 year ago |
AlpinDale
|
74604eb252
fix: pylint complaints (#91)
|
1 year ago |
AlpinDale
|
3d72f05c7b
feat: flattened 1D tensor -> 2D tensor (#85)
|
1 year ago |
AlpinDale
|
4e71bd1d12
feat: add PagedAttention V2 kernels (#76)
|
1 year ago |
AlpinDale
|
bdad759503
feat: YaRN context window extension support (#67)
|
1 year ago |
AlpinDale
|
cbeeabeb9a
feat: mistral support (#20)
|
1 year ago |
AlpinDale
|
472899e4bd
import any
|
1 year ago |
AlpinDale
|
45e72151a4
import dict
|
1 year ago |
AlpinDale
|
75c27d3e65
massive overhaul
|
1 year ago |
AlpinDale
|
e77960c57e
use float datatype for RoPE
|
1 year ago |
AlpinDale
|
57b5ef31e7
fix: wrong dtype in bias
|
1 year ago |
AlpinDale
|
45f6d9f923
initial refactor commit
|
1 year ago |
AlpinDale
|
826de3ef93
use flash attention with xformers
|
1 year ago |
AlpinDale
|
c687430ce7
bump xformers and clean up leftover code
|
1 year ago |
AlpinDale
|
17b15d74c7
fix: scheduler
|
1 year ago |