Layer Based Partition for Matrix Multiplication on Heterogeneous Processor Platforms
While many approaches have been proposed for parallel matrix multiplication, few of them address the problem on heterogeneous processor platforms. Finding the optimal schedule that balances the load across a heterogeneous processor set while minimizing the amount of communication remains an open question. Many studies are based on rectangular partition, yet the optimality of rectangular partition as a basis has not been well justified. In this paper, we propose a new method that schedules matrix multiplication on heterogeneous processor platforms with the mixed co-design goal of minimizing both the total communication volume and the multiplication completion time. We first present the schema of our layer based partition (LBP) method. We then demonstrate that our approach guarantees minimal communication volume, which is smaller than what rectangular partition can reach. We further analyze the problem of minimizing the task completion time, taking network topologies into account, and solve it in both the single-neighbor network case and the multi-neighbor network case. In the single-neighbor network case, we propose an equality based method to solve LBP, and simulation shows that the total communication volume is reduced by 75% relative to the lower bound of rectangular partition. In the multi-neighbor network case, we formulate LBP as a Mixed Integer Programming problem and reduce the total communication volume by 81%. LBP thus offers a promising perspective for tackling matrix multiplication problems on heterogeneous processor platforms.