Question

The Regular Expressions Quick Start reads:

Twelve characters have special meanings in regular expressions: the backslash \, the caret ^, the dollar sign $, the period or dot ., the vertical bar or pipe symbol |, the question mark ?, the asterisk or star *, the plus sign +, the opening parenthesis (, the closing parenthesis ), the opening square bracket [, and the opening curly brace {. These special characters are often called "metacharacters". Most of them are errors when used alone.

According to this list, (, ), [ and { are metacharacters, whereas the closing square bracket ] and the closing curly brace } are not.

Obviously, [ and { cannot take effect on their own, just as an opening parenthesis ( needs its partner ).

Why were ] and } left out of the list?

Solution

You can write a plain ] in most regex engines and have it match a ] in the input. This works because ] on its own does not have special meaning.

Contrariwise, you can't write a plain [ to match a [ in the input: the regex engine will complain that the brackets are unbalanced. This is probably why the quoted text says that [ is special and ] isn't.
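
The asymmetry described above can be checked in a few lines. This is a minimal sketch using Python's re module as one example engine; the helper name compiles is made up for illustration, and other engines may word the error differently or be stricter about a lone ].

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern (illustrative helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# A lone closing bracket is an ordinary literal and matches itself.
closing_match = re.search(r"]", "a]b")

# A lone opening bracket starts a character class that is never closed,
# so the pattern is rejected when it is compiled.
opening_ok = compiles(r"[")

# Escaping the opening bracket turns it into a literal.
escaped_match = re.search(r"\[", "a[b")

print(closing_match.group())
print(opening_ok)
print(escaped_match.group())
```

In this sketch only the opening bracket needs a backslash; the closing bracket matches as written.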

This seems unintuitive, but it's true. ] has a special meaning only within a character class, and the special meaning is "terminate the class". This is just like -, which also has a special meaning only within a character class (the meaning is "create a range between the two neighbouring letters"). You wouldn't take this to mean that - in general is a metacharacter, and for the same reason ] in general isn't a metacharacter.
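
The parallel between ] and - can be sketched the same way, again assuming Python's re module; the sample string and variable names are invented for illustration.

```python
import re

text = "a-c]b"

# Outside a character class, both "-" and "]" are plain literals.
outside = re.findall(r"-|]", text)

# Inside a class, "-" between two characters creates a range,
# so [a-c] matches a, b and c but not the hyphen itself.
in_range = re.findall(r"[a-c]", text)

# Inside a class, "]" would terminate the class, so it is escaped here
# to make it a member; a trailing "-" is taken literally.
as_members = re.findall(r"[\]-]", text)

print(outside)
print(in_range)
print(as_members)
```

Both characters only need special care once a [ has opened a class; everywhere else they match themselves.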

If this seems weird, you're in good company, but it is the most logical way of looking at it. The reason it looks weird is that we're conditioned to consider [] () {} as matching pairs. (Compare the well-known "paradox" that ())( is a palindrome, but (()) isn't.)

Licensed under: CC-BY-SA with attribution