(I'm ignoring the fact that you need a multiple-line match; how to do that differs between platforms and regex implementations (have a look at sed!)).
The .* directly after the word struct matches anything, including { and }. Thus, the string
struct s_one {bool a:3;} one; struct s_two {bool b:4;} two;
will be just one match. Worse,
struct one {
int noBits;
};
int main(void)
{
return (2>1)?1:0;
}
will match, which is not what you want (note the colon within the body of the main function). So you should instead look for a match that allows only something valid between the word struct and the opening brace. Try, for example:
struct\s+[a-zA-Z0-9_]+\s*{[^}]*:[^}]*}
which, in plain English, would translate to: "Search for the word struct, followed by one or more whitespace characters, followed by a valid identifier name consisting of only the given characters (one or more of them), optionally followed by any amount of whitespace, followed by an opening curly brace (we're now inside the definition of the struct), followed by any text except a closing curly brace (we don't want to leave the definition), having a colon somewhere, followed again by any text except a closing curly brace, followed by the closing brace."
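To see the pattern in action, here is a minimal sketch using Python's re module (my choice of tool, nothing in the question requires it). The braces are escaped to be on the safe side, and the helper name find_bitfield_structs is made up for this illustration. Since a negated class like [^}] also matches newlines, no special multi-line flag is needed here.

```python
import re

# The pattern from the text, with the curly braces escaped to be safe.
STRUCT_WITH_BITFIELD = re.compile(r"struct\s+[a-zA-Z0-9_]+\s*\{[^}]*:[^}]*\}")


def find_bitfield_structs(source):
    """Return every substring of `source` matched by the pattern.

    Hypothetical helper, only for this illustration.
    """
    return STRUCT_WITH_BITFIELD.findall(source)


# Two struct definitions on one line: each should be its own match.
two_structs = "struct s_one {bool a:3;} one; struct s_two {bool b:4;} two;"

# A struct without a bit field, followed by a colon inside main():
# the colon lies outside the struct body, so nothing should match.
no_bitfield = """struct one {
int noBits;
};
int main(void)
{
return (2>1)?1:0;
}"""

print(find_bitfield_structs(two_structs))
print(find_bitfield_structs(no_bitfield))
```

The first call reports the two definitions separately, and the second finds nothing, because [^}]* cannot run past the closing brace of the struct to reach the colon in main.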
Note that, depending on your regex engine, you may need to escape the curly braces (they have a special meaning in regex, where they denote repetition counts). Note also that a simpler regex may suffice (e.g., you could remove everything after the colon and it would still work), but what I wrote down gives a better idea of how to construct such a regex in general. Also note that this regex does not take comments inside the code into account (e.g., it does not match
struct one // my favorite first struct
{
bool a:8;
};
because the comment after one is matched neither by the 'valid identifier name' class [a-zA-Z0-9_]+ nor by the optional whitespace \s*, so the opening brace is never reached).
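One possible workaround, sketched here in Python under loud assumptions, is to strip comments before matching. The helper strip_comments and its two comment patterns are made up for this illustration and are deliberately naive: they know nothing about string literals, so a // inside a C string would be removed as well.

```python
import re

# The pattern from the text, with the curly braces escaped to be safe.
STRUCT_WITH_BITFIELD = re.compile(r"struct\s+[a-zA-Z0-9_]+\s*\{[^}]*:[^}]*\}")

# Naive comment patterns (assumption: no comment markers inside strings).
LINE_COMMENT = re.compile(r"//[^\n]*")
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_comments(source):
    """Replace each C comment by a single space (hypothetical helper)."""
    source = BLOCK_COMMENT.sub(" ", source)
    return LINE_COMMENT.sub(" ", source)


commented = """struct one // my favorite first struct
{
bool a:8;
};"""

# Without preprocessing, the comment blocks the match.
print(STRUCT_WITH_BITFIELD.search(commented))

# After stripping comments, the definition is found.
match = STRUCT_WITH_BITFIELD.search(strip_comments(commented))
print(match.group(0) if match else None)
```

Stripping comments first also avoids a false positive from a colon that appears only in a comment inside the struct body. For real code, a proper C preprocessor or parser is the more robust route.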