A Myhill-Nerode Theorem for Register Automata and Symbolic Trace Languages

07/07/2020

We propose a new symbolic trace semantics for register automata (extended finite state machines) which records both the sequence of input symbols that occur during a run and the constraints on input parameters that are imposed by this run. Our main result is a generalization of the classical Myhill-Nerode theorem to this symbolic setting. Our generalization requires the use of three relations to capture the additional structure of register automata. Location equivalence ≡_l captures that symbolic traces end in the same location, transition equivalence ≡_t captures that they share the same final transition, and a partial equivalence relation ≡_r captures that symbolic values v and v' are stored in the same register after symbolic traces w and w', respectively. A symbolic language is defined to be regular if relations ≡_l, ≡_t and ≡_r exist that satisfy certain conditions; in particular, they must all have finite index. We show that the symbolic language associated with a register automaton is regular, and we construct, for each regular symbolic language, a register automaton that accepts this language. Our result provides a foundation for grey-box learning algorithms in settings where the constraints on data parameters can be extracted from code using, e.g., tools for symbolic/concolic execution or tainting. We believe that moving to a grey-box setting is essential to overcome the scalability problems of state-of-the-art black-box learning algorithms.
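To make the idea of a symbolic trace concrete, the following is a minimal Python sketch of a one-register automaton whose runs record, for each input symbol, the constraint on the input parameter that the run imposed. Everything in it is an assumption made for illustration: the automaton, the location names `q0`/`q1`/`q2`, the symbols `set` and `check`, and the encoding of guards as the labels `"true"`, `"eq"` and `"neq"` are hypothetical and much simpler than the formal definitions in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    source: str   # location the transition leaves
    symbol: str   # input symbol
    guard: str    # "true", "eq" or "neq": input parameter vs. the register
    store: bool   # whether the input parameter is written to the register
    target: str   # location the transition enters


# Hypothetical automaton: "set" stores a value, "check" compares against it.
TRANSITIONS = [
    Transition("q0", "set", "true", True, "q1"),
    Transition("q1", "check", "eq", False, "q2"),
    Transition("q1", "check", "neq", False, "q1"),
]


def holds(guard, register, value):
    """Evaluate a guard label for a concrete register content and input."""
    if guard == "true":
        return True
    if guard == "eq":
        return register == value
    if guard == "neq":
        return register != value
    raise ValueError(f"unknown guard: {guard}")


def symbolic_trace(word, start="q0"):
    """Run the automaton on a list of (symbol, value) pairs.

    Returns (final location, list of (symbol, guard) pairs), or None if
    no transition is enabled at some step.
    """
    location, register, trace = start, None, []
    for symbol, value in word:
        enabled = [
            t for t in TRANSITIONS
            if t.source == location
            and t.symbol == symbol
            and holds(t.guard, register, value)
        ]
        if not enabled:
            return None
        t = enabled[0]
        trace.append((symbol, t.guard))  # record the constraint, not the value
        if t.store:
            register = value
        location = t.target
    return location, trace
```

In this sketch, the concrete words `[("set", 1), ("check", 1)]` and `[("set", 9), ("check", 9)]` carry different data values but yield the same symbolic trace, which is the kind of abstraction over data that a symbolic semantics is meant to provide.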
