AlpinDale
|
f91991f584
fix: f-string fixes
|
5 months ago |
AlpinDale
|
dba22e4f83
fix: add zeromq fallback for broadcasting large objects (e.g. vlm images)
|
5 months ago |
AlpinDale
|
7d79c0e726
chore: use nvml query to avoid accidental cuda initialization
|
5 months ago |
AlpinDale
|
a89c9a0e92
fix: device ordinal issues with world_size and stuff
|
5 months ago |
AlpinDale
|
34b41e0a87
chore: add coordinator to reduce code duplication in tp and pp
|
5 months ago |
AlpinDale
|
270bd333af
chore: check if process is on the same node
|
5 months ago |
AlpinDale
|
b2fd915c35
improve p2p access check
|
5 months ago |
AlpinDale
|
b984fe4a91
refactor custom allreduce to support multiple tp groups
|
6 months ago |
AlpinDale
|
47a5c5c00c
don't check the full nvlink connectivity
|
6 months ago |
AlpinDale
|
096d9eb6c5
enhance nvlink detection
|
8 months ago |
AlpinDale
|
9d81716bfd
[v0.5.3] Release Candidate (#388)
|
8 months ago |