Question

I need to search all directories in a C++ code base for structures that contain bit fields. I know this can be accomplished with regular expressions, but have been unable to put together the correct expression to accomplish this. Basically I need to find any occurrence of struct*{:} where "*" is any character. Thanks in advance for any suggestions.


Solution

(I'm ignoring the fact that you need a multi-line match; how to do that differs by platform and regex implementation (have a look at sed!).)

The .* directly after the word struct is greedy and matches anything, including { and }. Thus, the string

struct s_one {bool a:3;} one; struct s_two {bool b:4} two;

will be just one match. Worse,

struct one { int noBits; };

int main(void)
{
  return (2>1)?1:0;
}

will match, which is not what you want (note the colon within the body of the main function). So you should instead look for a match that has only something valid between struct and the opening brace. Try, for example:

struct\s+[a-zA-Z0-9_]+\s*{[^}]*:[^}]*}

which, in plain English, translates to: "Search for the word struct, followed by one or more whitespace characters, followed by a valid identifier consisting of only the given characters (one or more of them), optionally followed by any amount of whitespace, followed by an opening curly brace (we're now inside the definition of the struct), followed by any text except a closing curly brace (we don't want to leave the definition), with a colon somewhere in it, followed again by any text except a closing curly brace, followed by the closing brace."
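As a rough illustration, the pattern can be applied to the contents of a source file with std::regex. This is a sketch under assumptions: the function name findBitFieldStructs is hypothetical, the braces are escaped because the ECMAScript grammar treats them as quantifier syntax, and reading the files and walking the directories is left out. Because [^}] also matches newlines, a definition spread over several lines is found without any multi-line flag.

```cpp
#include <regex>
#include <string>
#include <vector>

// Return the text of every struct definition matched by the pattern above.
std::vector<std::string> findBitFieldStructs(const std::string& source) {
    // Same pattern as in the answer, with { and } escaped for std::regex.
    static const std::regex re(R"(struct\s+[a-zA-Z0-9_]+\s*\{[^}]*:[^}]*\})");
    std::vector<std::string> hits;
    for (std::sregex_iterator it(source.begin(), source.end(), re), end; it != end; ++it) {
        hits.push_back(it->str());  // the whole matched definition, up to the closing brace
    }
    return hits;
}
```

Keep in mind that any colon inside the braces counts, so a struct containing an access label such as public: would also be reported, and a struct with nested braces is only examined up to its first closing brace.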

Note that, depending on your regex engine, you may need to escape the curly braces (they have a special meaning in regex). Note also that a simpler regex may suffice (e.g., you could remove everything after the colon and it would still work), but what I wrote down gives a better idea of how to construct such a regex in general. Also note that this regex does not take comments inside the code into account. For example, it does not match

struct one // my favorite first struct
{
  bool a:8;
};

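This fails because the comment sits between the identifier and the opening brace: one // my favorite first struct is not matched by the 'valid identifier name' part [a-zA-Z0-9_]+ followed only by whitespace.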
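One possible workaround, sketched here and not part of the original answer, is to blank out comments before applying the pattern. The helper names stripComments and hasBitFieldStruct are hypothetical, and the sketch assumes that comment markers never appear inside string or character literals.

```cpp
#include <regex>
#include <string>

// Replace // line comments and /* block */ comments with a single space.
// Assumption: comment markers never occur inside string or character literals.
std::string stripComments(const std::string& source) {
    static const std::regex comment(R"(//[^\n]*|/\*[^*]*\*+(?:[^/*][^*]*\*+)*/)");
    return std::regex_replace(source, comment, " ");
}

// True if the comment-free text contains a struct matched by the answer's pattern.
bool hasBitFieldStruct(const std::string& source) {
    static const std::regex re(R"(struct\s+[a-zA-Z0-9_]+\s*\{[^}]*:[^}]*\})");
    const std::string cleaned = stripComments(source);
    return std::regex_search(cleaned, re);
}
```

With the comments removed, the example above matches, and a colon that appears only inside a comment no longer causes a false positive.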

Licensed under: CC-BY-SA with attribution