One Group in Regex: Does it make sense?

https://stackoverflow.com/questions/19982577

30-07-2022

Question

It works, but is there a way to remove the group "word" and still get the same matches?

string targetString = "5782\tabdikace\t101\r\n5705\tAbdul\t178\r\n5293\tabeceda\t590\r\n5769\tabecední\t114\r\n5651\tÁbel\t232\r\n5750\tÁber\t133\r\n5757\tAbcházie\t126\r\n5624\tAbigail\t259";

var matches = Regex.Matches(targetString, "[0-9]+\t(?<word>[^\t]+)\t[0-9]+");
foreach (Match w in matches)
{
    wordsList.Add(w.Groups["word"].ToString());
}

Solution

You can do this with a positive lookbehind and a positive lookahead. These check that text matching a pattern exists before or after a given point, without consuming that text or including it in the match.

The equivalent to your expression would be

(?<=[0-9]+\t)[^\t]+(?=\t[0-9]+)
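As a minimal sketch of how that expression could be dropped into the loop from the question: the class name LookaroundDemo and the helper ExtractWords are made up for illustration, and the input is a shortened copy of the question's string. It relies on .NET allowing a variable-length pattern such as [0-9]+ inside a lookbehind, which many other regex engines reject.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Hypothetical helper (not from the question): returns the middle column
    // of every "number<TAB>word<TAB>number" record found in the input.
    public static List<string> ExtractWords(string input)
    {
        var words = new List<string>();
        // The lookbehind and lookahead only assert the neighbouring columns,
        // so Match.Value is the word itself and no named group is needed.
        foreach (Match m in Regex.Matches(input, @"(?<=[0-9]+\t)[^\t]+(?=\t[0-9]+)"))
        {
            words.Add(m.Value);
        }
        return words;
    }

    public static void Main()
    {
        // Shortened version of the question's targetString.
        string targetString = "5782\tabdikace\t101\r\n5705\tAbdul\t178\r\n5293\tabeceda\t590";
        foreach (string word in ExtractWords(targetString))
        {
            Console.WriteLine(word); // abdikace, Abdul, abeceda (one per line)
        }
    }
}
```

For data shaped like the question's, where a line break separates each record from the next, this should yield the same words as the version with the named group.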

Note that this does not necessarily give the same results as your original expression. Look at the following:

Input string                       0\t one \t1\t two \t2\t three \t3
Matches in original version        11111111111         2222222222222
Matches in new version             ...11111...         ...3333333...
. = checked but not consumed                 ...22222...

Observe how, since the lookahead and lookbehind do not consume the 1 and 2 but only check that they are there, they allow the value " two " to be matched, where your original expression did not. Whether you want this or not is up to you.
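To see the difference concretely, here is a small sketch that runs both patterns over the input string from the diagram. The class name OverlapDemo and the two helper methods are assumptions made up for this illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class OverlapDemo
{
    // Original approach: the surrounding numbers are consumed by each match,
    // so a number cannot serve as the boundary of two neighbouring words.
    public static string[] WithGroup(string input) =>
        Regex.Matches(input, @"[0-9]+\t(?<word>[^\t]+)\t[0-9]+")
             .Cast<Match>()
             .Select(m => m.Groups["word"].Value)
             .ToArray();

    // Lookaround approach: the numbers are only checked, never consumed,
    // so the same number can close one word and open the next.
    public static string[] WithLookarounds(string input) =>
        Regex.Matches(input, @"(?<=[0-9]+\t)[^\t]+(?=\t[0-9]+)")
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();

    public static void Main()
    {
        string input = "0\t one \t1\t two \t2\t three \t3";
        Console.WriteLine(string.Join("|", WithGroup(input)));       // " one | three "
        Console.WriteLine(string.Join("|", WithLookarounds(input))); // " one | two | three "
    }
}
```

If every record sits on its own line, as in the question's data, the two approaches should agree. The extra match only appears when one number is shared between two adjacent words.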

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow