The Health Gym: Synthetic Health-Related Datasets for the Development of Reinforcement Learning Algorithms

by Nicholas I-Hsien Kuo, et al.

In recent years, the machine learning research community has benefited tremendously from the availability of openly accessible benchmark datasets. Clinical data, however, are usually not openly available because they are highly confidential. This has hampered the development of reproducible and generalisable machine learning applications in health care. Here we introduce the Health Gym, a growing collection of highly realistic synthetic medical datasets that can be freely accessed to prototype, evaluate, and compare machine learning algorithms, with a specific focus on reinforcement learning. The three synthetic datasets described in this paper represent patient cohorts with acute hypotension and sepsis in the intensive care unit, and people with human immunodeficiency virus (HIV) receiving antiretroviral therapy in ambulatory care. The datasets were created using a novel generative adversarial network (GAN). The distributions of variables, the correlations between variables, and the trends over time in the synthetic datasets mirror those in the real datasets. Furthermore, the risk of sensitive information disclosure associated with the public distribution of the synthetic datasets is estimated to be very low.
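As a rough illustration of what it means for synthetic data to "mirror" real data, the sketch below compares per-variable distributions with a two-sample Kolmogorov-Smirnov statistic and compares the Pearson correlation matrices of two datasets. This is not the evaluation procedure used by the authors; the function names `ks_statistic` and `max_correlation_gap` are hypothetical, and the "real" and "synthetic" arrays are randomly generated stand-ins rather than Health Gym data.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def max_correlation_gap(real, synthetic):
    """Largest absolute difference between the two Pearson correlation matrices."""
    gap = np.corrcoef(real, rowvar=False) - np.corrcoef(synthetic, rowvar=False)
    return float(np.max(np.abs(gap)))

# Assumption: stand-in data only. Both samples are drawn from the same
# two-variable Gaussian, so a good match is expected by construction.
rng = np.random.default_rng(0)
cov = [[1.0, 0.6], [0.6, 1.0]]
real = rng.multivariate_normal([0.0, 0.0], cov, size=2000)
synthetic = rng.multivariate_normal([0.0, 0.0], cov, size=2000)

# One KS statistic per variable (column), plus one summary of correlation mismatch.
ks_per_variable = [ks_statistic(real[:, j], synthetic[:, j]) for j in range(real.shape[1])]
corr_gap = max_correlation_gap(real, synthetic)
print(ks_per_variable, corr_gap)
```

Small values for both quantities suggest that marginal distributions and pairwise correlations are similar. Checks like these do not cover trends over time or disclosure risk, which need separate analyses.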


Generative Adversarial Networks Applied to Observational Health Data

Having been collected for its primary purpose in patient care, Observati...

Synthetic Health-related Longitudinal Data with Mixed-type Variables Generated using Diffusion Models

This paper presents a novel approach to simulating electronic health rec...

Synthetic Dataset Generation of Driver Telematics

This article describes techniques employed in the production of a synthe...

Synbols: Probing Learning Algorithms with Synthetic Datasets

Progress in the field of machine learning has been fueled by the introdu...

Ward2ICU: A Vital Signs Dataset of Inpatients from the General Ward

We present a proxy dataset of vital signs with class labels indicating p...

ricu: R's Interface to Intensive Care Data

Providing computational infrastructure for handling diverse intensive ca...
