AlpinDale
|
9d81716bfd
[v0.5.3] Release Candidate (#388)
|
8 months ago |
AlpinDale
|
78d66f16d1
Chunked Prefill Part 1 (#384)
|
9 months ago |
AlpinDale
|
9181fa0396
feat: Triton kernels for sampling (#383)
|
9 months ago |
AlpinDale
|
4b99ac15b7
fix: do not deepcopy metadata
|
9 months ago |
AlpinDale
|
17b034613d
chore: make metadata a dataclass (#377)
|
9 months ago |
AlpinDale
|
f8dfac6372
chore: attention refactor and upstream sync apr01 (#365)
|
9 months ago |
50h100a
|
b9e0ae87c5
fix fine-grained seeding.
|
10 months ago |
AlpinDale
|
e42a78381a
feat: switch from pylint to ruff (#322)
|
10 months ago |
sgsdxzy
|
50c0875c32
chore: log total memory usage (#316)
|
10 months ago |
AlpinDale
|
c2d77b1822
chore: logging refactor (#302)
|
10 months ago |
AlpinDale
|
9810daa699
feat: INT8 KV Cache (#298)
|
10 months ago |
AlpinDale
|
ac82b67f75
feat: naive context shift and various QoL changes (#289)
|
10 months ago |
AlpinDale
|
4d04ade9ef
feat: fine-grained seeds (#279)
|
11 months ago |
AlpinDale
|
697c06c4f5
fix: LoRA support for mixtral (#276)
|
11 months ago |
AlpinDale
|
4b80b42362
fix: memory leaks due to nccl cuda graphs (#275)
|
11 months ago |
AlpinDale
|
ea0f57b233
feat: allow further support for non-cuda devices (#247)
|
11 months ago |
AlpinDale
|
1a94ccf3cf
fix: prefix cache fail with lora (#239)
|
11 months ago |
AlpinDale
|
c3a221eb02
feat: GGUF, QuIP#, and Marlin support (#228)
|
1 year ago |
AlpinDale
|
31c95011a6
feat: FP8 E5M2 KV Cache (#226)
|
1 year ago |
AlpinDale
|
641bb0f6e9
feat: add custom allreduce kernels (#224)
|
1 year ago |
AlpinDale
|
c0aac15421
feat: S-LoRA support (#222)
|
1 year ago |
AlpinDale
|
8fa608aeb7
feat: replace Ray with NCCL for control plane comms (#221)
|
1 year ago |
AlpinDale
|
d54791aaa8
feat: reduce sampler overhead by making it less blocking (#198)
|
1 year ago |
AlpinDale
|
7d91e9e0f2
feat: CUDA graphs (#172)
|
1 year ago |
g4rg
|
2aab3da9bd
chore: fix Python 3.8+ compatibility (#170)
|
1 year ago |
AlpinDale
|
ae57df0f44
fix: sliding window for mistral/mixtral (#163)
|
1 year ago |
AlpinDale
|
653da510d1
chore: rewrite InputMetadata (#143)
|
1 year ago |