Question

This should be pretty simple, but I am having trouble understanding how '+' works in the regex.h library in C. Not sure what is going wrong.

Here is some sample code which doesn't work. I want to find a string which starts with B and ends with A. There can be more than one occurrence of B, so I want to use B+.

#include <regex.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, const char * argv[])
{
regex_t regex;
int reti;

/* Compile regular expression */
reti = regcomp(&regex, "^B+A$", 0);
if( reti)
{
    printf("Could not compile regex\n");
    exit(1);
}

/* Execute regular expression */
reti = regexec(&regex, "BBBA", 0, NULL, 0);
if (!reti )
{
    printf("Match\n");
}
else if( reti == REG_NOMATCH )
{
    printf("No match\n");
}
else
{
    printf("Regex match failed\n");
    exit(1);
}

/* Free compiled regular expression if you want to use the regex_t again */
regfree(&regex);
return 0;
}

This does not find the match, but I am not able to understand why.

Using ^BB*A$ works fine, but that is not what I want, because I also want to check for something like ^[BCD]+A$, which should match BBBA or CCCCA or DDDDA. Using ^[BCD][BCD]*A$ won't work for me, as that could match BCCCA, which is not the desired match. I tried using parentheses and brackets in the expression, but it doesn't seem to help.

Quick help is much appreciated.

Solution

By default regcomp() compiles a pattern as a so-called Basic Regular Expression (BRE). In POSIX BRE syntax an unescaped + is not an operator but an ordinary character, so "^B+A$" only matches the literal string "B+A". The syntax you're trying to use is known as Extended Regular Expression (ERE) syntax. To make regcomp() use it, pass the REG_EXTENDED flag as the third argument instead of 0.
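As a minimal sketch of the difference, the helper below compiles a pattern with whatever flags it is given and reports whether the text matched. The function name matches() is made up for this illustration and is not part of regex.h; the only library calls used are regcomp(), regexec() and regfree().

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of regex.h): returns true if `text`
 * matches `pattern` when compiled with the given regcomp() flags.
 * A pattern that fails to compile is reported as "no match". */
bool matches(const char *pattern, const char *text, int cflags)
{
    regex_t regex;

    /* REG_NOSUB: we only need a yes/no answer, not match offsets. */
    if (regcomp(&regex, pattern, cflags | REG_NOSUB) != 0)
        return false;

    int reti = regexec(&regex, text, 0, NULL, 0);
    regfree(&regex);
    return reti == 0;
}
```

With this helper, matches("^B+A$", "BBBA", REG_EXTENDED) should return true, while matches("^B+A$", "BBBA", 0) should return false, because in basic syntax the + is taken literally.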

By the way, this comment:

As I also want to check for something like ^[BCD]+A$ which should match BBBA or CCCCA or DDDDA. Usage of ^[BCD][BCD]*A$ won't work for me as that could match BCCCA which is not the desired match

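is based on a misconception of how the quantifiers + and * work. The regular expressions ^[BCD]+A$ and ^[BCD][BCD]*A$ are exactly equivalent: X+ means "one or more X", which is the same as X followed by "zero or more X". Each repetition of [BCD] picks a character from the set independently, so both patterns match BCCCA. If every repeated letter has to be the same, one option is to spell that out with alternation in extended syntax, for example ^(B+|C+|D+)A$.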
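To check this, here is a small sketch that compiles each pattern with REG_EXTENDED and tests it against the strings from the question. The helper name matches_ere() is hypothetical, and the alternation pattern ^(B+|C+|D+)A$ is only one possible way to require a run of a single repeated letter.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of regex.h): returns true if `text`
 * matches `pattern` compiled as a POSIX Extended Regular Expression.
 * A pattern that fails to compile is reported as "no match". */
bool matches_ere(const char *pattern, const char *text)
{
    regex_t regex;

    if (regcomp(&regex, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    int reti = regexec(&regex, text, 0, NULL, 0);
    regfree(&regex);
    return reti == 0;
}
```

Under these assumptions, matches_ere("^[BCD]+A$", "BCCCA") and matches_ere("^[BCD][BCD]*A$", "BCCCA") should both return true, while matches_ere("^(B+|C+|D+)A$", "BCCCA") should return false and still accept BBBA, CCCCA and DDDDA.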

Licensed under: CC-BY-SA with attribution