Are there improvements on Dana Angluin's algorithm for learning regular sets

https://cs.stackexchange.com/questions/118

16-10-2019
|

Pergunta

In her 1987 seminal paper Dana Angluin presents a polynomial time algorithm for learning a DFA from membership queries and theory queries (counterexamples to a proposed DFA).

She shows that if you are trying to learn a minimal DFA with $n$ states, and your largest countexample is of length $m$, then you need to make $O(mn^2)$ membership-queries and at most $n - 1$ theory-queries.

Have there been significant improvements on the number of queries needed to learn a regular set?

References and Related Questions

Dana Angluin (1987) "Learning Regular Sets from Queries and Counterexamples", Infortmation and Computation 75: 87-106
Lower bounds for learning in the membership query and counterexample model

Solução

In his answer on cstheory.SE, Lev Reyzin directed me to Robert Schapire's thesis that improves the bound to $O(n^2 + n\log m)$ membership queries in section 5.4.5. The number of counterexample queries remains unchanged. The algorithm Schapire uses differs in what it does after a counterexample query.

Sketch of the improvement

At the highest level, Schapire forces $(S,E,T)$ from Angluin's algorithm to have the extra condition that for a closed $(S,E,T)$ and each $s_1, s_2 \in S$ if $s_1 \neq s_2$ then $row(s_1) \neq row(s_2)$. This guarantees that $|S| \leq n$ and also makes the consistency property of Angluin's algorithm trivial to satisfy. To ensure this, he has to handle the results of a counterexample differently.

Given a counterexample $z$, Angluin simply added $z$ and all its prefixes to $S$. Schapire does something more subtle by instead adding a single element $e$ to $E$. This new $e$ will make $(S,E,T)$ to be not closed in Angluin's sense and the update to get closure with introduce at least one new string to $S$ while keeping all rows distinct. The condition on $e$ is:

$$\exists s, s' \in S, a \in \Sigma \quad \text{s.t} \quad row(s) = row(s'a) \; \text{and} \; o(\delta(q_0,se)) \neq o(\delta(q_0,s'ae))$$

Where $o$ is the output function, $q_0$ is the initial state, and $\delta$ the update rule of the true 'unknown' DFA. In otherwords, $e$ must serve as a witness to distinguish the future of $s$ from $s'a$.

To figure out this $e$ from $z$ we do a binary search to figure out a substring $r_i$ such that $z = p_ir_i$ and $0 \leq |p_i| = i < |z|$ such that the behavior of our conjectured-machine differs based on one input character. In more detail, we let $s_i$ be the string corresponding to the state reached in our conjectured machine by following $p_i$. We use binary search (this is where the $\log m$ comes from) to find an $k$ such that $o(\delta(q_0,s_kr_k)) \neq o(\delta(q_0,s_{k+1}r_{k+1})$. In other words, $r_{k+1}$ distinguishes two states that our conjectured machines finds equivalent and thus satisfies the condition on $e$, so we add it to $E$.

Outras dicas

I do not know if my answer is still relevant. Recently it has been described the implementation of a new algorithm called Observation Pack or in some circumstances Discrimination Tree by Falk Howar. This algorithm is like L* but use Rivest-Shapire or a other method(see Steffen and Isberner) for handle counterexample decomposition; and it uses a data structure, a discrimination tree (a binary tree) for make efficiently a "sift" namely the insertion of a A-transition (where A is each symbol of alphabet) of a new state found until there isn't closedness. This algorithm exists in two versions: OneGlobally and OneLocally according to whether the suffix founded in the decomposition is added to each component or not (the ratio behind the algorithm is that all prefixes in a component are equivalent to a short prefix and represent the same state in the target according to suffixes found at this time. Later with a new counterexample a new suffix is found that discriminates at least 2 prefixes of a same component. This cause a split of that component in two component). With OneLocally there are far fewer membership query but the number of equivalence query can increase drastically with big target DFA. Rather OneGlobally has a number of membership query always lower than L* (but greater than OneLocally) and a similar number of equivalences queries than L*

I know that exists another algorithm too: TTT algorithm that is better than Observation Pack too but I don't have a good knowledge of it.TTT algorithm should be the state of art

Licenciado em: CC-BY-SA com atribuição

Não afiliado a cs.stackexchange