FOREST: An Interactive Multi-tree Synthesizer for Regular Expressions

12/28/2020
by Margarida Ferreira, et al.

Form validators based on regular expressions are often used on digital forms to prevent users from entering data in the wrong format. However, writing these validators can be a challenge for some users. We present FOREST, a regular expression synthesizer for digital form validation. FOREST produces a regular expression that matches the desired pattern for the input values, together with a set of conditions over capturing groups that ensure the validity of integer values in the input. Our synthesis procedure is based on enumerative search and uses a Satisfiability Modulo Theories (SMT) solver to explore and prune the search space. We propose a novel representation for regular expression synthesis, the multi-tree, which induces patterns in the examples and uses them to split the problem through a divide-and-conquer approach. We also present a new SMT encoding to synthesize capture conditions for a given regular expression. To increase confidence in the synthesized regular expression, we implement user interaction based on distinguishing inputs. We evaluated FOREST on real-world form-validation instances that use regular expressions. Experimental results show that FOREST successfully returns the desired regular expression in 72% of the instances and outperforms REGEL, a state-of-the-art regular expression synthesizer.
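To make the two ideas in the abstract concrete, the following is a minimal Python sketch of what a validator of this shape might look like. It is not FOREST's implementation or output format. The date pattern, the bounds in `CAPTURE_CONDITIONS`, and the function names `is_valid` and `distinguishing_input` are all assumptions chosen for illustration. The search for a distinguishing input here is a plain brute-force enumeration over short strings, used in place of the SMT-based procedure the abstract describes.

```python
import re
from itertools import product

# Hypothetical synthesizer output for dates written as DD/MM/YYYY:
# a regular expression with capturing groups, plus integer bounds on some groups.
DATE_REGEX = re.compile(r"([0-9]{2})/([0-9]{2})/([0-9]{4})")
# Group index -> (minimum, maximum). These bounds are illustrative assumptions.
CAPTURE_CONDITIONS = {1: (1, 31), 2: (1, 12)}


def is_valid(value, regex=DATE_REGEX, conditions=CAPTURE_CONDITIONS):
    """Accept value only if it fully matches regex and every bounded group is in range."""
    match = regex.fullmatch(value)
    if match is None:
        return False
    return all(lo <= int(match.group(i)) <= hi for i, (lo, hi) in conditions.items())


def distinguishing_input(regex_a, regex_b, alphabet, max_len):
    """Return a string accepted by exactly one of the two regexes, or None.

    Brute-force enumeration by increasing length; a stand-in for an
    SMT-based search, only practical for tiny alphabets and lengths.
    """
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if bool(regex_a.fullmatch(candidate)) != bool(regex_b.fullmatch(candidate)):
                return candidate
    return None
```

In this sketch, `is_valid("28/13/2020")` is rejected by the capture condition on the month group even though the string matches the regular expression. A string returned by `distinguishing_input` could be shown to the user, whose answer about whether it should be accepted rules out one of the two candidate expressions.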
