AlpinDale
|
34b41e0a87
chore: add coordinator to reduce code duplication in tp and pp
|
5 months ago |
AlpinDale
|
5b0c11d190
support pipeline parallel pynccl groups
|
6 months ago |
AlpinDale
|
c58589318f
remove the graph mode func
|
6 months ago |
AlpinDale
|
236be273e5
feat: tensor parallel speculative decoding (#554)
|
6 months ago |
AlpinDale
|
b984fe4a91
refactor custom allreduce to support multiple tp groups
|
6 months ago |
AlpinDale
|
8ae2cce237
refactor pynccl
|
6 months ago |
AlpinDale
|
92cb5b42d9
support both cpu and device tensor in broadcast tensor dict
|
6 months ago |
AlpinDale
|
21ce19b3ea
blocks_to_copy dict -> torch.Tensor
|
6 months ago |
AlpinDale
|
1879e32510
enable all-reduce for multiple tp groups
|
6 months ago |
AlpinDale
|
ac5b4b6aa7
broadcast metadata through cpu
|
6 months ago |
AlpinDale
|
9d81716bfd
[v0.5.3] Release Candidate (#388)
|
8 months ago |