Scipp: Scientific data handling with labeled multi-dimensional arrays for C++ and Python

10/01/2020
by Simon Heybrock, et al.

Scipp is heavily inspired by the Python library xarray. It enriches raw NumPy-like multi-dimensional arrays of data by adding named dimensions and associated coordinates. Multiple arrays are combined into datasets. On top of this, scipp introduces (i) implicit handling of physical units, (ii) implicit propagation of uncertainties, (iii) support for histograms, i.e., bin-edge coordinate axes, which exceed the data's dimension extent by one, and (iv) support for event data. In conjunction, these features enable a more natural and more concise user experience. The combination of named dimensions, coordinates, and units drastically reduces the risk of programming errors. The core of scipp is written in C++ to open opportunities for performance improvements that a Python-based solution would not allow. On top of the C++ core, scipp's Python components provide functionality for plotting and content representations, e.g., for use in Jupyter Notebooks. While none of scipp's concepts is novel in isolation, we are not aware of any project combining all of these aspects in a single coherent software package.
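To make the listed features concrete, the following is a minimal pure-Python sketch of how a named dimension, a unit, and propagated uncertainties can travel together with the data, and of the bin-edge length rule. It is not scipp's API: the class `LabeledArray`, its fields, and the helper `is_bin_edge_coord` are hypothetical names invented for illustration, units are compared as plain strings, and uncertainties are assumed to be uncorrelated so that variances simply add under addition.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class LabeledArray:
    """Toy 1-D array (hypothetical, NOT scipp's API) with a named
    dimension, a unit string, and one variance per element."""
    dim: str
    unit: str
    values: tuple
    variances: tuple

    def __add__(self, other):
        # Named dimensions: refuse to combine data along different axes.
        if self.dim != other.dim:
            raise ValueError(f"dimension mismatch: {self.dim} vs {other.dim}")
        # Units: refuse to add quantities with different units.
        if self.unit != other.unit:
            raise ValueError(f"unit mismatch: {self.unit} vs {other.unit}")
        if len(self.values) != len(other.values):
            raise ValueError("length mismatch")
        values = tuple(a + b for a, b in zip(self.values, other.values))
        # Assumption: uncorrelated errors, so variances add under addition.
        variances = tuple(a + b for a, b in zip(self.variances, other.variances))
        return LabeledArray(self.dim, self.unit, values, variances)

    def stddevs(self):
        """Standard deviations derived from the stored variances."""
        return tuple(sqrt(v) for v in self.variances)


def is_bin_edge_coord(data_length, coord_length):
    """A bin-edge coordinate has one more entry than the data it labels."""
    return coord_length == data_length + 1


a = LabeledArray("x", "m", (1.0, 2.0), (0.25, 1.0))
b = LabeledArray("x", "m", (3.0, 4.0), (0.75, 3.0))
total = a + b

print(total.values)      # (4.0, 6.0)
print(total.variances)   # (1.0, 4.0)
print(total.stddevs())   # (1.0, 2.0)

# Two data points in "x" need three bin edges to form a histogram.
print(is_bin_edge_coord(len(total.values), 3))  # True
```

Adding `a` to an array whose unit is `"s"` or whose dimension is `"y"` raises a `ValueError` in this sketch, which mirrors the kind of mistake that named dimensions and units are meant to catch early.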

