Pattern matching language knowledge, pattern matching approach

https://stackoverflow.com/questions/21416002

04-10-2022
|

Pregunta

I am trying to implement a pattern matching "syntax" and language. I know of regular expressions but these aren't enough for my scopes. I have individuated some "mathematical" operators. In the examples that follow I will suppose that the subject of pattern mathing are character strings but it isn't necessary.

Having read the description bellow: The question is, does any body knows of a mathematical theory explicitating that or any language that takes the same approach implementing it ? I would like to look at it in order to have ideas !

Descprition of approach:

At first we have characters. Characters may be aggregated to form strings.

A pattern is: a) a single character b) an ordered group of patterns with the operator matchAny c) an ordered group of patterns with the operator matchAll d) other various operators to see later on.

Explanation: We have a subject character string and a starting position.

If we check for a match of a single character, then if it matches it moves the current position forward by one position.

If we check for a match of an ordered group of patterns with the operator matchAny then it will check each element of the group in sequence and we will have a proliferation of starting positions that will get multiplied by the number of possible matches being advanced by the length of the match.

E.G suppose the group of patterns is { "a" "aba" "ab" "x" "dd" } and the string under examination is: "Dabaxddc" with current position 2 ( counting from 1 ). Then applying matchAny with the previous group we have that "a" mathces "aba" matches and "ab" matches while "x" and "dd" do not match. After having those matches there are 3 starting positions 3 4 5 ( corresponding to "a" "ab" "aba" ).

We may continue our pattern matching by accepting to have more then one starting positions. So now we may continue to the next case under examination and check for a matchAll. matchAll means that all patterns must match sequentially and are applied sequentially. subcases of matchAll are match0+ match1+ etc.

I have to add that the same fact to try to ask the question has already helped me and cleared me out some things. But I would like to know of similar approaches in order to study them. Please only languages used by you and not bibliography !!!

No hay solución correcta

Otros consejos

I suggest you have a look at the paper "Parsing Permutation Phrases". It deals with recognizing a set of things in any order where the "things" can be recognizers themselves. The presentation in the paper might be a little different than what you expect; they don't compile to finite automaton. However, they do give an implementation in a functional language and that should be helpful to you.

Your description of matching strings against patterns is exactly what a compiler does. In particular, your description of multiple potential matches is highly reminiscent of the way an LR parser works.

If the patterns are static and can be described by an EBNF, then you could use an LR parser generator (such as YACC) to generate a recogniser.

If the patterns are dynamic but can still be formulated as EBNF there are other tools that can be applied. It just gets a bit more complicated.

[In Australia at least, Computer Science was a University course in 1975, when I did mine. YACC dates from around 1970 in its original form. EBNF is even older.]

Licenciado bajo: CC-BY-SA con atribución

No afiliado a StackOverflow