AutoDS: Towards Human-Centered Automation of Data Science

01/13/2021
by Dakuo Wang, et al.

Data science (DS) projects often follow a lifecycle that consists of laborious tasks for data scientists and domain experts (e.g., data exploration and model training). Only recently have machine learning (ML) researchers developed promising automation techniques to aid data workers in these tasks. This paper introduces AutoDS, an automated machine learning (AutoML) system that aims to leverage the latest ML automation techniques to support data science projects. Data workers only need to upload their dataset; the system can then automatically suggest ML configurations, preprocess the data, select an algorithm, and train the model. These suggestions are presented to the user via a web-based graphical user interface and a notebook-based programming user interface. We studied AutoDS with 30 professional data scientists: one group used AutoDS to complete a data science project, and the other group completed it without AutoDS. As expected, AutoDS improves productivity; yet, surprisingly, we find that the models produced by the AutoDS group have higher quality and fewer errors, but lower human confidence scores. We reflect on the findings by presenting design implications for incorporating automation techniques into human work in the data science lifecycle.

research
05/12/2021

Automating Data Science: Prospects and Challenges

Given the complexity of typical data science projects and the associated...
research
11/29/2018

Prediction Factory: automated development and collaborative evaluation of predictive models

In this paper, we present a data science automation system called Predic...
research
01/07/2021

How Much Automation Does a Data Scientist Want?

Data science and machine learning (DS/ML) are at the heart of the recent...
research
02/19/2023

AutoDOViz: Human-Centered Automation for Decision Optimization

We present AutoDOViz, an interactive user interface for automated decisi...
research
01/12/2021

Fits and Starts: Enterprise Use of AutoML and the Role of Humans in the Loop

AutoML systems can speed up routine data science work and make machine l...
research
01/07/2020

Vamsa: Tracking Provenance in Data Science Scripts

Machine learning (ML) which was initially adopted for search ranking and...
research
08/03/2023

Automated Machine Learning in the smart construction era: Significance and accessibility for industrial classification and regression tasks

This paper explores the application of automated machine learning (AutoM...
