Influence of atomic FAA on ParallelFor and a cost model for improvements

11/26/2021

This paper focuses on one of the most frequently used multithreading library interfaces: ParallelFor. The study infers that ParallelFor's end-to-end latency is noticeably affected by how often atomic fetch-and-add (FAA) is called during program execution. This can be explained by ParallelFor's uniform semantics and its use of atomic FAA. To test this hypothesis, a battery of tests was designed and run on diverse platforms. From the collected performance statistics and overall trends, several conclusions are drawn, and a cost model is proposed to improve performance by mitigating the influence of FAA.
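
To make the relationship between FAA frequency and work distribution concrete, here is a minimal C++ sketch of a ParallelFor in which worker threads claim iterations through an atomic fetch-and-add on a shared counter. It is not the paper's implementation: the function signature, the `block` parameter, and the returned FAA count are assumptions introduced only for illustration. The sketch shows that claiming `block` iterations per FAA, rather than one, reduces the number of FAA calls for the same loop.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical ParallelFor (illustrative, not the paper's code).
// Each worker claims `block` consecutive iterations with one atomic FAA.
// Returns the total number of FAA calls made on the shared index counter.
std::size_t ParallelFor(std::size_t total, std::size_t block,
                        unsigned num_threads,
                        const std::function<void(std::size_t)>& body) {
    assert(block > 0 && num_threads > 0);
    std::atomic<std::size_t> next{0};       // shared iteration counter
    std::atomic<std::size_t> faa_calls{0};  // instrumentation only

    auto worker = [&]() {
        std::size_t local_calls = 0;  // counted locally to avoid extra contention
        for (;;) {
            // One FAA claims the half-open range [begin, begin + block).
            std::size_t begin = next.fetch_add(block, std::memory_order_relaxed);
            ++local_calls;
            if (begin >= total) break;  // no work left for this thread
            std::size_t end = std::min(begin + block, total);
            for (std::size_t i = begin; i < end; ++i) body(i);
        }
        faa_calls.fetch_add(local_calls, std::memory_order_relaxed);
    };

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < num_threads; ++t) threads.emplace_back(worker);
    for (auto& th : threads) th.join();
    return faa_calls.load();
}
```

In this sketch, the FAA count is the number of claimed blocks plus one final failed claim per thread, so a loop of 1000 iterations on 4 threads issues 1004 FAA calls with `block = 1` and 14 with `block = 100`. Larger blocks trade fewer atomic operations for coarser load balancing, which is the kind of trade-off a cost model would need to weigh.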
