D2O - a distributed data object for parallel high-performance computing in Python

06/16/2016
by T. Steininger, et al.

We introduce D2O, a Python module for cluster-distributed multi-dimensional numerical arrays. It acts as a layer of abstraction between the algorithm code and the data-distribution logic. The main goal is to achieve usability without losing numerical performance and scalability. D2O's global interface is similar to that of a numpy.ndarray, while each cluster node's local data remains directly accessible for use in customized high-performance modules. D2O is written in pure Python, which makes it portable and easy to use and modify. Expensive operations are carried out by dedicated external libraries such as numpy and mpi4py. The performance of D2O is on par with numpy for serial applications and scales well when moving to an MPI cluster. D2O is open-source software available under the GNU General Public License v3 (GPL-3) at https://gitlab.mpcdf.mpg.de/ift/D2O
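
The split between a global, ndarray-like interface and directly accessible node-local data can be illustrated with a minimal single-process sketch. The class `ToyDistributedArray` and its methods `map_local` and `gather` are hypothetical names invented for this illustration; they are not part of the D2O API, and the "nodes" here are simply list entries rather than MPI processes.

```python
import numpy as np


class ToyDistributedArray:
    """Hypothetical stand-in for a distributed array (not the D2O API).

    The global array is cut along axis 0 into one slab per simulated node.
    """

    def __init__(self, global_data, n_nodes):
        global_data = np.asarray(global_data)
        self.shape = global_data.shape
        # np.array_split also handles lengths that do not divide evenly.
        self.local_data = [
            chunk.copy() for chunk in np.array_split(global_data, n_nodes, axis=0)
        ]

    def sum(self):
        """Global reduction: each node reduces its slab, then results combine.

        On a real cluster the final combination would be a collective
        operation such as an MPI allreduce; here it is a plain Python sum.
        """
        partial_sums = [chunk.sum() for chunk in self.local_data]
        return sum(partial_sums)

    def map_local(self, func):
        """Apply func to each node's slab independently (no communication)."""
        self.local_data = [func(chunk) for chunk in self.local_data]
        return self

    def gather(self):
        """Reassemble the full array from the per-node slabs."""
        return np.concatenate(self.local_data, axis=0)


a = np.arange(12, dtype=float).reshape(6, 2)
d = ToyDistributedArray(a, n_nodes=4)

total = d.sum()               # global interface, looks like ndarray.sum()
d.map_local(lambda x: x**2)   # custom code working on node-local data
squared = d.gather()
```

The calling code never touches the slab boundaries when it uses `sum`, while `map_local` shows how a custom routine could operate on each node's data directly. A real implementation would replace the list of slabs with one slab per MPI process.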

