Scalable Modular Synthetic Data Generation for Advancing Aerial Autonomy

by Mehrnaz Sabet, et al.

Harnessing the benefits of drones for urban innovation at scale requires reliable aerial autonomy. One major barrier to advancing aerial autonomy has been the difficulty of collecting large-scale aerial datasets for training machine learning models. Because real-world data collection with deployed drones is costly and time-consuming, there has been an increasing shift towards using synthetic data to train models for drone applications. However, to improve the generalizability of policies trained on synthetic data, it becomes crucial to incorporate domain randomization into the data generation workflow to address the sim-to-real problem. Current synthetic data generation tools either lack domain randomization or rely heavily on manual work or real samples to configure and generate diverse, realistic simulation scenes. These dependencies limit the scalability of the data generation workflow. Accordingly, balancing generalizability and scalability in synthetic data generation remains a major challenge. To address these gaps, we introduce a modular, scalable data generation workflow tailored to aerial autonomy applications. To generate realistic configurations of simulation scenes while increasing diversity, we present an adaptive layered domain randomization approach that creates a type-agnostic distribution space for assets over the base map of the environments before generating poses for the drone trajectory. We leverage high-level scene structures to automatically place assets in valid configurations and then extend the diversity through obstacle generation and global parameter randomization. We demonstrate the effectiveness of our method in automatically generating diverse configurations and datasets and show its potential for downstream performance optimization. Our work contributes to generating enhanced benchmark datasets for training models that can generalize better to real-world situations.
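
The layering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the base map is a small grid in which 1 marks a cell where any asset type may be placed (a stand-in for the "type-agnostic distribution space"). The names `Scene`, `generate_scene`, `BASE_MAP`, the parameter ranges, and the global parameters `sun_elevation_deg` and `fog_density` are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Scene:
    assets: list         # (row, col) cells holding placed assets
    obstacles: list      # (row, col) cells holding extra obstacles
    global_params: dict  # scene-wide randomized parameters (hypothetical)
    poses: list          # (row, col, altitude_m, yaw_deg) drone poses


def generate_scene(base_map, n_assets, n_obstacles, n_poses, seed=None):
    """Layered randomization sketch: assets, obstacles, globals, then poses."""
    rng = random.Random(seed)

    # Layer 1: placeable cells of the base map (assumption: 1 = valid).
    free = [(r, c) for r, row in enumerate(base_map)
            for c, v in enumerate(row) if v == 1]
    if n_assets + n_obstacles > len(free):
        raise ValueError("not enough placeable cells on the base map")

    # Layers 2 and 3: sample without replacement so assets and
    # obstacles never share a cell.
    picked = rng.sample(free, n_assets + n_obstacles)
    assets, obstacles = picked[:n_assets], picked[n_assets:]

    # Layer 4: global parameter randomization (illustrative ranges).
    global_params = {
        "sun_elevation_deg": rng.uniform(10.0, 90.0),
        "fog_density": rng.uniform(0.0, 0.3),
    }

    # Pose generation comes last, over cells left unoccupied.
    occupied = set(picked)
    open_cells = [(r, c) for r, row in enumerate(base_map)
                  for c in range(len(row)) if (r, c) not in occupied]
    if n_poses > 0 and not open_cells:
        raise ValueError("no unoccupied cells left for drone poses")
    poses = [(r, c, rng.uniform(5.0, 50.0), rng.uniform(0.0, 360.0))
             for r, c in rng.choices(open_cells, k=n_poses)]

    return Scene(assets, obstacles, global_params, poses)


BASE_MAP = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
```

Calling `generate_scene(BASE_MAP, 3, 2, 4, seed=0)` twice returns identical scenes, so a configuration can be regenerated from its seed. Changing the seed varies every layer at once.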




Domain Randomization for Scene-Specific Car Detection and Pose Estimation

We address the issue of domain gap when making use of synthetic data to ...

Photo-realistic Neural Domain Randomization

Synthetic data is a scalable alternative to manual supervision, but it r...

Automatic Generation of Synthetic Colonoscopy Videos for Domain Randomization

An increasing number of colonoscopic guidance and assistance systems rel...

Sim2SG: Sim-to-Real Scene Graph Generation for Transfer Learning

Scene graph (SG) generation has been gaining a lot of traction recently....

STPLS3D: A Large-Scale Synthetic and Real Aerial Photogrammetry 3D Point Cloud Dataset

Although various 3D datasets with different functions and scales have be...

Balanced Face Dataset: Guiding StyleGAN to Generate Labeled Synthetic Face Image Dataset for Underrepresented Group

For a machine learning model to generalize effectively to unseen data wi...

Learning Vine Copula Models For Synthetic Data Generation

A vine copula model is a flexible high-dimensional dependence model whic...
