CUBES: A Parallel Synthesizer for SQL Using Examples

by Ricardo Brancas, et al.

In recent years, more and more people have come to depend on data manipulation tasks in their work. However, many of these users lack the programming background required to write complex programs, particularly SQL queries. One way to help them is to automatically synthesize the SQL query from a small set of examples provided by the user, a task known as Query Reverse Engineering. In the last decade, a plethora of program synthesizers for SQL has been proposed, but none of the current tools takes advantage of the increased number of cores per processor. This paper proposes CUBES, a parallel program synthesizer for the domain of SQL queries using input-output examples. CUBES extends current sequential query synthesizers with new pruning techniques and a divide-and-conquer approach that splits the search space into smaller independent sub-problems. Because examples are an under-specification, the synthesized query may not match the user's intent. We improve the accuracy of CUBES by developing a disambiguation procedure based on fuzzing that interacts with the user and increases our confidence that the returned query matches the user's intent. We perform an extensive evaluation on around 4000 SQL queries from different domains. Experimental results show that our sequential version can solve more instances than other state-of-the-art SQL synthesizers. Moreover, the parallel approach can scale up to 16 processes with super-linear speedups for many hard instances. Our disambiguation approach is critical to achieving an accuracy of around 75
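To make the idea of fuzzing-based disambiguation concrete, the following is a minimal sketch and not the CUBES implementation. It assumes two hypothetical candidate queries that both satisfy the user's example, then randomly perturbs the input table until the two queries disagree; such an input could be shown to the user to decide which query matches the intent. The table `emp`, its columns, the candidate queries, and the helper names `run`, `fuzz`, and `find_distinguishing_input` are all illustrative assumptions.

```python
import random
import sqlite3

# Hypothetical candidates: both return the same rows on the user's example.
CANDIDATES = [
    "SELECT name FROM emp WHERE salary > 40 ORDER BY name",
    "SELECT name FROM emp WHERE age < 35 ORDER BY name",
]

# Hypothetical user-provided input example: (name, age, salary).
EXAMPLE_ROWS = [("ann", 30, 50), ("bob", 40, 30)]


def run(query, rows):
    """Evaluate a query on a fresh in-memory table filled with `rows`."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE emp (name TEXT, age INTEGER, salary INTEGER)")
        conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


def fuzz(rows, rng):
    """Keep the names but draw new random values for the numeric columns."""
    return [(name, rng.randint(20, 60), rng.randint(10, 90)) for name, _, _ in rows]


def find_distinguishing_input(q1, q2, rows, rng, attempts=1000):
    """Search for a fuzzed input on which the two queries give different output."""
    for _ in range(attempts):
        candidate = fuzz(rows, rng)
        if run(q1, candidate) != run(q2, candidate):
            return candidate
    return None  # no difference found within the attempt budget


if __name__ == "__main__":
    q1, q2 = CANDIDATES
    witness = find_distinguishing_input(q1, q2, EXAMPLE_ROWS, random.Random(0))
    if witness is not None:
        print("input:", witness)
        print("query 1 output:", run(q1, witness))
        print("query 2 output:", run(q2, witness))
```

In an interactive setting, the two printed outputs would be presented to the user, and the candidate whose output the user rejects would be discarded. Returning `None` when the budget runs out only means that no difference was observed, not that the queries are equivalent.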


