FAIR Data Pipeline: provenance-driven data management for traceable scientific workflows

10/14/2021
by Sonia Natalie Mitchell, et al.

Modern epidemiological analyses to understand and combat the spread of disease depend critically on access to, and use of, data. Rapidly evolving data, such as data streams changing during a disease outbreak, are particularly challenging. Data management is further complicated by data being imprecisely identified when used. Public trust in policy decisions resulting from such analyses is easily damaged and is often low, with cynicism arising where claims of "following the science" are made without accompanying evidence. Tracing the provenance of such decisions back through open software to primary data would clarify this evidence, enhancing the transparency of the decision-making process. Here, we demonstrate a Findable, Accessible, Interoperable and Reusable (FAIR) data pipeline developed during the COVID-19 pandemic that allows easy annotation of data as they are consumed by analyses, while tracing the provenance of scientific outputs back through the analytical source code to data sources. Such a tool provides a mechanism for the public, and fellow scientists, to better assess the trust that should be placed in scientific evidence, while allowing scientists to support policy-makers in openly justifying their decisions. We believe that tools such as this should be promoted for use across all areas of policy-facing research.


research
07/27/2020

From climate change to pandemics: decision science can help scientists have impact

Scientific knowledge and advances are a cornerstone of modern society. T...
research
05/17/2020

Enhancing Covid-19 Decision-Making by Creating an Assurance Case for Simulation Models

Simulation models have been informing the COVID-19 policy-making process...
research
09/19/2022

Mapping Climate Change Research via Open Repositories & AI: advantages and limitations for an evidence-based R&D policy-making

In the last few years, several initiatives have been starting to offer a...
research
06/03/2023

brainlife.io: A decentralized and open source cloud platform to support neuroscience research

Neuroscience research has expanded dramatically over the past 30 years b...
research
06/24/2020

Data-driven Analytical Models of COVID-2019 for Epidemic Prediction, Clinical Diagnosis, Policy Effectiveness and Contact Tracing: A Survey

The widely spread CoronaVirus Disease (COVID)-19 is one of the worst inf...
research
02/28/2022

An Algebraic Framework for Structured Epidemic Modeling

Pandemic management requires that scientists rapidly formulate and analy...
research
08/06/2021

On the role of data, statistics and decisions in a pandemic

A pandemic poses particular challenges to decision-making with regard to...
