Question

Specifically, when does ^ mean "match start" and when does it mean "not the following" in regular expressions?

From the Wikipedia article and other references, I've concluded that it means the former at the start of a pattern and the latter when used inside brackets. But how does the program handle the case where the caret is at the start of the pattern and directly in front of a bracket? What does, say, ^[b-d]t$ match?

Solution

^ means "not the following" only when it is the first character inside [], as in [^...].

When it's inside [] but not at the start, it means the actual ^ character.

When it's escaped (\^), it also means the actual ^ character.

In all other cases it means start of the string or start of the line (which of the two depends on the language and its settings).

So in short:

  • [^abc] -> not a, b or c
  • [ab^cd] -> a, b, ^ (character), c or d
  • \^ -> a ^ character
  • Anywhere else -> start of string / line.
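The four cases above can be checked with a small sketch. It uses Python's re module purely for illustration (the question does not name a language), and the sample strings and variable names are made up for the demo.

```python
import re

# [^abc]: caret is first inside the brackets, so the class is negated
negated = re.findall(r"[^abc]", "abcd^")

# [ab^cd]: caret is not first inside the brackets, so it is a literal ^
literal_in_class = re.findall(r"[ab^cd]", "a^e")

# \^: escaped caret, again a literal ^
escaped = re.findall(r"\^", "2^10")

# ^ outside brackets: anchor, so only the "ab" at the very start matches
anchored = re.findall(r"^ab", "abab")

print(negated)           # ['d', '^']
print(literal_in_class)  # ['a', '^']
print(escaped)           # ['^']
print(anchored)          # ['ab']
```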

So ^[b-d]t$ means:

  • Start of line
  • b/c/d character
  • t character
  • End of line
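As a quick check of that reading, here is a minimal sketch in Python (an assumed language, chosen only for illustration). The candidate strings are invented for the demo. The caret in front of the bracket is an anchor, not a negation, so only the two-character strings bt, ct and dt match.

```python
import re

# ^ here is outside the brackets, so it anchors the match at the start
pattern = re.compile(r"^[b-d]t$")

candidates = ["bt", "ct", "dt", "at", "^t", "bta", "bct"]
results = {text: bool(pattern.search(text)) for text in candidates}

for text, matched in results.items():
    print(f"{text!r}: {matched}")
```

Only 'bt', 'ct' and 'dt' report True. 'at' fails because a is outside the range b-d, '^t' fails because the caret is not treated as a literal here, and 'bta' and 'bct' fail because of the $ anchor and the single-character class.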

Other tips

Assuming block comments can be ignored: ^\s* might be a bad choice here, because \s also matches newlines and the match can therefore span lines. Check whether .NET supports the horizontal-whitespace class \h; if it does not, [^\S\r\n] works as well.

You can use the inline multi-line modifier (?m) (or RegexOptions.Multiline). It changes the meaning of ^ from beginning of string (the default) to beginning of line. The pattern then becomes (?m)^\h*(#), and the capture group gives the position of the #. Alternatively, (?m)(?<=^\h*)# works just as well; in that case the position of the match itself is the offset.

For complete regex information, see https://docs.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference

Note that ^\s* will work of course, but it matches a lot of unnecessary cruft that can span lines.
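The difference between the two patterns can be seen in a short sketch. It is written in Python rather than .NET, so it is an approximation of the advice above: Python's re module has no \h and does not allow variable-length lookbehind, so the sketch uses [^\S\r\n] and the capture-group variant. The sample source text is invented for the demo.

```python
import re

# Hypothetical input: one trailing comment and three line-leading comments
source = "\n".join([
    "x = 1  # trailing comment",
    "  # indented comment",
    "# top-level comment",
    "",
    "\t# after blank line",
    "",
])

# (?m) makes ^ match at the start of every line;
# [^\S\r\n]* matches horizontal whitespace only, never a newline
strict = re.compile(r"(?m)^[^\S\r\n]*(#)")

# Same idea with \s*, which is allowed to swallow newlines
loose = re.compile(r"(?m)^\s*(#)")

strict_matches = list(strict.finditer(source))
loose_matches = list(loose.finditer(source))

# Offset of each line-leading # comes from capture group 1
offsets = [m.start(1) for m in strict_matches]
print(offsets)

# Full matched text: the loose pattern drags a blank line into one match
print([m.group(0) for m in strict_matches])
print([m.group(0) for m in loose_matches])
```

Both patterns report the same # offsets and both skip the trailing comment, but the last match of the \s* version starts on the blank line and runs across the newline, which is the "unnecessary cruft" mentioned above.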

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow