On Optimal Trees for Irregular Gather and Scatter Collectives
This paper studies the complexity of finding cost-optimal communication trees for rooted, irregular gather and scatter collective communication operations in fully connected, one-ported communication networks under a linear, but not necessarily homogeneous transmission cost model. In the irregular gather and scatter problems, different processors may specify data blocks of possibly different sizes. Processors are numbered consecutively, and data blocks shall be collected or distributed from some (given) root processor. Data blocks from and to processors can be combined into larger segments consisting of multiple blocks; but individual data blocks cannot be split. We distinguish between ordered and unordered problems and algorithms. In an ordered, irregular gather tree algorithm all non-leaf processors receive segments of data blocks for consecutively numbered ranges of processors. An unordered tree algorithm permits received segments to consist of blocks for processors in any order. We show that the ordered problems can be solved in polynomial time, and give simple dynamic programming algorithms to construct optimal communication trees for both gather and scatter problems. In contrast, we show that the unordered problems are NP-complete. We have implemented the dynamic programming algorithms to experimentally evaluate the quality of a recent, simple, distributed algorithm which constructs close to optimal trees. Our experiments show that it is indeed very close to the optimum for a selection of data block distributions, and likely sufficient for all practical purposes.
READ FULL TEXT