Question

I am learning grep at the moment, but I am having difficulty understanding how the Kleene star metacharacter works. The man page says that * matches the previous character zero or more times. I am using a file named test with the following content:

*a
123ab
1234
abcdef
a?
?

grep 'a*' test should match zero or more occurrences of a and, as explained, prints every line of the test file. The documentation further says that to match metacharacters like * literally, they have to be escaped by preceding them with a backslash \. But the output of grep '*' test and grep '\*' test is the same:

*a

Why is * matching itself without a preceding \?

Solution

* on its own is an invalid regular expression since there is no previous item to repeat. Your implementation of grep, in this case, interprets it as a literal *. \* is a valid regular expression which matches a *. Your implementation's interpretation of the invalid regular expression * and the valid regular expression \* just happen to be the same.

If you really want to see the difference between * and \*, you should try it on a valid regular expression by adding an item before it. For example, a literal a:

grep 'a*' test
grep 'a\*' test

The former matches every line, since a* can successfully match zero characters. The latter matches only lines that contain a literal a*.
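To see the difference on the sample file from the question, here is a minimal sketch. It assumes a POSIX shell with mktemp available; the temporary directory and the variable names all_count and literal_count are only for illustration.

```shell
# Recreate the sample file from the question in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' '*a' '123ab' '1234' 'abcdef' 'a?' '?' > "$tmpdir/test"

# a* matches zero or more "a", so the empty match succeeds on every line.
all_count=$(grep -c 'a*' "$tmpdir/test")

# a\* requires a literal "a*"; the file contains "*a" but never "a*".
# grep exits non-zero when nothing matches, hence the "|| true".
literal_count=$(grep -c 'a\*' "$tmpdir/test" || true)

printf '%s matches %s lines\n' 'a*' "$all_count"
printf '%s matches %s lines\n' 'a\*' "$literal_count"
```

grep -c prints the number of matching lines instead of the lines themselves, which makes the two results easy to compare.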

Other tips

A leading * in a basic regular expression is valid according to section 9.3.3 of SUSv3. Naruto, your platform's regular expression interpreter is doing the right thing here: * is not a special character when it is at the start of a basic regular expression.
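A short sketch of that rule, assuming grep's default basic regular expression syntax (with -E the behavior of a leading * may differ). The temporary directory and the variable names leading, escaped, and anchored are only for illustration.

```shell
# Recreate the sample file from the question in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' '*a' '123ab' '1234' 'abcdef' 'a?' '?' > "$tmpdir/test"

# A leading * has nothing to repeat, so a BRE treats it as an ordinary character.
leading=$(grep '*' "$tmpdir/test")

# The escaped form matches a literal * too, so it selects the same line.
escaped=$(grep '\*' "$tmpdir/test")

# * is also ordinary when it directly follows an initial ^ anchor.
anchored=$(grep '^*' "$tmpdir/test")

printf '%s\n' "$leading" "$escaped" "$anchored"
```

All three commands select only the line *a, which is the output reported in the question.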

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow