Question

I need to extract single-line comments from a qmake project file. The rules are simple: a comment begins with the # symbol and ends with a line break (\n). I read some of the QRegExp documentation and wrote the following code to print all comments in a qmake file:

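QRegExp comment_expr ("#(.*)\n$");
comment_expr.setMinimal (true);
int comment_index = 0;
while ((comment_index = _project_contents.indexOf (comment_expr, comment_index)) != -1)
{
    QString comment_text = comment_expr.cap (0);
    qDebug() << "Comment 1" << comment_text;
    // Move past the current match so the next search continues after it
    comment_index += comment_expr.matchedLength ();
}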

But it doesn't work correctly: the whole contents of the project file are printed instead. Where is my mistake? As I understand the docs, this should work, but it doesn't.

P.S. I'm a newbie in regexes, so please don't beat me :)

Solution

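The problem is that, in QRegExp, . "matches any character (including newline)", and $ matches only at the end of the whole string, not at the end of each line. So the pattern matches from the first # right through to the newline at the very end of the file.

You could replace . with "anything but a newline", [^\n], and change the $ to (\n|$) (newline or end of string):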

"#[^\n]*(\n|$)"

But then this matches # anywhere instead of just at the start of a line, so let's try this:

"(^|\n)#[^\n]*(\n|$)"

^ is the start of the string, so (^|\n) (start of string or newline) matches whatever comes just before the start of a line.
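
To see what the (^|\n) prefix changes, here is a small sketch that runs both patterns over the same text. It uses std::regex from the C++ standard library instead of QRegExp so that it compiles without Qt; that substitution, the helper name findAll and the sample text are my assumptions, not something from the question.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the full text of every non-overlapping match.
std::vector<std::string> findAll(const std::string& text, const std::string& pattern)
{
    const std::regex re(pattern);  // ECMAScript grammar by default
    std::vector<std::string> matches;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it)
        matches.push_back(it->str());
    return matches;
}
```

Called on "TARGET = app # trailing\n# full line\n", findAll with "#[^\n]*(\n|$)" should return both "# trailing\n" and "# full line\n", while "(^|\n)#[^\n]*(\n|$)" should return only "\n# full line\n" (note that the leading newline is part of the match).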

Can you see a problem there? What if you have 2 comments in 2 consecutive lines? You'll only match the first, since the new-line will be consumed during matching the first (since the next match starts where the previous one finished).
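
The lost second comment can be reproduced with a short sketch. Again this uses std::regex rather than QRegExp, and it assumes that ^ matches only at the very start of the text (the default for std::regex when no multiline option is given, and QRegExp's default caret mode). The helper name countMatches is hypothetical.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: counts non-overlapping matches, where each search
// resumes at the position where the previous match ended.
std::size_t countMatches(const std::string& text, const std::string& pattern)
{
    const std::regex re(pattern);
    const std::sregex_iterator first(text.begin(), text.end(), re);
    const std::sregex_iterator last;
    return static_cast<std::size_t>(std::distance(first, last));
}
```

With the text "#one\n#two\nTARGET = app\n" and the pattern "(^|\n)#[^\n]*(\n|$)", countMatches should report 1 under those assumptions: the newline after #one is consumed by the first match, so nothing is left to satisfy (^|\n) in front of #two.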

A work-around for this is using look-ahead:

"(^|\n)#[^\n]*(?=\n|$)"

This causes the end newline to not be included in the match (but it is still checked), thus the position will be just before the new-line and the next match can use it.
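
Here is a minimal sketch of the look-ahead version collecting consecutive comments. It is written with std::regex instead of QRegExp so it builds without Qt. The function name lineComments and the extra capture group around the comment text are my additions, used only to strip the leading newline from each result.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns each whole-line comment without the
// surrounding newlines.
std::vector<std::string> lineComments(const std::string& text)
{
    // Group 1 is the start-of-line check, group 2 is the comment itself.
    // The look-ahead tests for a newline or the end of text without consuming it.
    static const std::regex re("(^|\n)(#[^\n]*)(?=\n|$)");
    std::vector<std::string> comments;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it)
        comments.push_back((*it)[2].str());
    return comments;
}
```

For "#one\n#two\nTARGET = app # trailing\n" this should yield "#one" and "#two": the newline after #one stays available for the next match, and the trailing comment is skipped because it does not start a line.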

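Can the # be preceded by spaces? If so, allow zero or more whitespace characters (\s*) in front of it. In a C++ string literal the backslash has to be doubled. Note that \s also matches line breaks, so blank lines directly before a comment become part of the match; use [ \t]* instead if only spaces and tabs should be skipped:

"(^|\n)\\s*#[^\n]*(?=\n|$)"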
Licensed under: CC-BY-SA with attribution