I have a sentence in which the text between a start marker and an end marker may contain any special character, digit, or single letter, but not a whole word.

To make this clearer, here is an example:

I have a sentence like "Today's Market value 0.5 percent".

In this sentence, no other word may appear between "Market value" and "percent".

Statements allowed:
1) "Today's Market value*    0.5 percent"
2) "Today's Market value\1   0.5 percent"
3) "Today's Market value \1 0.5 percent"
4) "Today's Market value e   0.5 percent"
5) "Today's Market value 0.5 percent"

Statements not allowed:
1) "Today's market value is    0.5 percent"
2) "Today's market value  is 0.5 percent"
3) "Today's Market value is 0.5 percent"

I am mainly interested in extracting the market value, i.e. "0.5" here.

Please suggest a proper way to build a regex that meets this requirement.

Solution

Here is code that extracts the number of interest when the string is valid:

using System;
using System.Text.RegularExpressions;

string[] strList = new[] {
    @"Today's Market value*    0.5 percent",
    @"Today's Market value\1   0.5 percent",
    @"Today's Market value \1 0.5 percent",
    @"Today's Market value e   0.5 percent",
    @"Today's Market value 0.5 percent",
    @"Today's market value is    0.5 percent",
    @"Today's market value  is 0.5 percent",
    @"Today's Market value is 0.5 percent"
};
// Two consecutive letters between "Market value" and the number, or between the
// number and "percent", count as a word and block the match.
Regex regex = new Regex(@"(?<=Market value.*\s)(?<!Market value.*[a-zA-Z]{2}.*)\d+(\.\d+)?(?=\s.*percent)(?!.*[a-zA-Z]{2}.*percent)", RegexOptions.Singleline);
foreach (string str in strList)
{
    Match m = regex.Match(str);
    // Print only the strings that satisfy the rule, together with the number.
    if (m.Success)
        Console.WriteLine("{0} : {1}", m.Value, str);
}

Output:

0.5 : Today's Market value*    0.5 percent
0.5 : Today's Market value\1   0.5 percent
0.5 : Today's Market value \1 0.5 percent
0.5 : Today's Market value e   0.5 percent
0.5 : Today's Market value 0.5 percent

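Basic idea: the number must be preceded by the text "Market value", then anything, then whitespace, but must not be preceded by "Market value" followed anywhere by two or more consecutive letters. Likewise, the number must be followed by whitespace, anything, and the text "percent", but must not be followed by two or more consecutive letters anywhere before "percent". A single letter, such as the "e" in the fourth example, is therefore still allowed. The match is case-sensitive, so the two inputs that spell "market" in lowercase are rejected for that reason as well.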
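
The four lookarounds described above are easier to read when laid out one per line. The following is only a sketch of the same pattern using RegexOptions.IgnorePatternWhitespace; the helper name TryGetMarketValue and the conversion to decimal are assumptions added for illustration and are not part of the original answer.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Same pattern as in the answer, one lookaround per line. Literal spaces are
// escaped ("\ ") because IgnorePatternWhitespace drops unescaped whitespace.
var pattern = new Regex(@"
    (?<=Market\ value.*\s)             # preceded by 'Market value', anything, whitespace
    (?<!Market\ value.*[a-zA-Z]{2}.*)  # but no run of two letters since 'Market value'
    \d+(\.\d+)?                        # the number itself
    (?=\s.*percent)                    # followed by whitespace, anything, 'percent'
    (?!.*[a-zA-Z]{2}.*percent)         # but no run of two letters before 'percent'
    ", RegexOptions.Singleline | RegexOptions.IgnorePatternWhitespace);

// Hypothetical helper: returns true and the parsed number when the text qualifies.
bool TryGetMarketValue(string text, out decimal value)
{
    value = 0m;
    Match m = pattern.Match(text);
    return m.Success && decimal.TryParse(
        m.Value, NumberStyles.AllowDecimalPoint, CultureInfo.InvariantCulture, out value);
}

if (TryGetMarketValue(@"Today's Market value \1 0.5 percent", out decimal v))
    Console.WriteLine(v);
```

Parsing with CultureInfo.InvariantCulture keeps the "." decimal separator independent of the machine's regional settings.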

Other answers

Try this regex:

\bMarket value\b(?!\s+is\s)[\s\S]*?(\d+(?:\.\d+)?)\s*percent\b

(?!\s+is\s) is a negative lookahead that checks there is no "is" directly after "Market value". The number itself is captured in group 1.
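
As a rough illustration of how this pattern might be used from C#, the sketch below reads the number from capture group 1. The sample strings are assumptions chosen for the demonstration; the third one shows that the lookahead excludes only the literal word "is", so any other word still gets through.

```csharp
using System;
using System.Text.RegularExpressions;

var isOnly = new Regex(@"\bMarket value\b(?!\s+is\s)[\s\S]*?(\d+(?:\.\d+)?)\s*percent\b");

string[] samples =
{
    "Today's Market value e   0.5 percent",   // allowed: matches
    "Today's Market value is 0.5 percent",    // rejected by the lookahead
    "Today's Market value was 0.5 percent"    // also matches: only "is" is excluded
};

foreach (string s in samples)
{
    Match m = isOnly.Match(s);
    // The number is in capture group 1; m.Value would be the whole matched span.
    Console.WriteLine("{0} -> {1}", s, m.Success ? m.Groups[1].Value : "(no match)");
}
```

If every word has to be rejected rather than just "is", the two-consecutive-letters check from the first answer is the more general rule.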


Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow