Question

OK, I know it's a recurring issue, but I can't seem to find a single working solution based solely on regex.

So, this is what I've come up with (basing it on the 'literal' description of multiline comments in the ECMA C# grammar specification).

\/\*(([^\*])+)|([\*]+(?!\/))[\*]+\/

However, as you can see it's not working...

Demo:

http://regexr.com?38gom

Any ideas? Is this even possible without doing all sorts of hacks? (Well, I mean other than the regex itself... lol)


P.S. In case it's relevant, I'm currently developing a lexer/parser/interpreter using Lex/Bison/C/D, and obviously multiline comments are something I need to handle...


Solution

Here is a working regex for the sample you provided on regexr.com:

\/\*+((([^\*])+)|([\*]+(?!\/)))[*]+\/

or, with the dotall option enabled so that . also matches line breaks:

\/\*.*?\*\/
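
To try the lazy pattern outside regexr, something like the following Python sketch should do. The sample text and the name find_block_comments are made up for illustration, and Python's re module is only an assumed stand-in for whatever engine you end up using.

```python
import re

# Lazy pattern from the answer; DOTALL lets '.' cross line breaks,
# which is what makes multi-line comments match.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

def find_block_comments(source):
    """Return every /* ... */ comment found in source, in order."""
    return BLOCK_COMMENT.findall(source)

# Hypothetical sample: one single-line and one multi-line comment.
sample = "int a; /* one */ int b; /* two\n * lines **/ int c;"
print(find_block_comments(sample))
```

Because the quantifier is lazy, each match stops at the first */ it reaches, so two comments on the same line are reported separately.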

OTHER TIPS

In case you need this for flex, which doesn't implement non-greedy matches, here is one way of writing the regex:

[/][*][^*]*[*]+([^/*][^*]*[*]+)*[/]

Alternative, not much easier on the eyes:

"/*"[^*]*"*"+([^/*][^*]*"*"+)*"/"

The / doesn't need to be quoted. But the stars do, and it seems more consistent. Yet another option is to quote the stars with backslashes, but I find that even harder to read.
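
The bracketed form uses only character classes, * and +, so it can be checked in any ordinary regex engine. Below is a minimal sketch in Python, assuming its re module is an acceptable stand-in for flex on this particular pattern; the quoted-string form is flex syntax and would not carry over. The helper name first_comment is hypothetical.

```python
import re

# Flex-style pattern from above: no lazy quantifiers and no lookahead.
# After each run of stars, the group only continues if the next
# character is neither '/' nor '*', so the match ends at the first "*/".
FLEX_STYLE = re.compile(r"[/][*][^*]*[*]+([^/*][^*]*[*]+)*[/]")

def first_comment(source):
    """Return the first block comment in source, or None if there is none."""
    m = FLEX_STYLE.search(source)
    return m.group(0) if m else None

print(first_comment("x = 1; /* a * b ** c */ y = 2; /* second */"))
```

Interior stars are absorbed by the repeated group, while an unterminated comment simply fails to match.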


If you did need this for flex/lex, you would have been better off adding an appropriate tag to the question.

In C# I get the best performance with @"(?s:/\*((?!\*/).)*\*/)".

If you want to match all comments (including line comments), use @"(?>/(/[^\r\n]*|(?s:\*((?!\*/).)*\*/)))".
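
For experimenting with the combined pattern outside .NET, here is a rough Python sketch. It is an adaptation rather than the exact C# expression: the atomic group (?>...) is swapped for a plain non-capturing group, since atomic groups are not available in every engine, and the inner capture is made non-capturing. The name find_comments and the sample text are made up.

```python
import re

# Same shape as the C# pattern: a '/' followed by either a line comment
# or a block comment. (?s:...) turns on dotall for the block part only,
# and (?!\*/). consumes one character at a time unless "*/" comes next.
ANY_COMMENT = re.compile(r"(?:/(?:/[^\r\n]*|(?s:\*(?:(?!\*/).)*\*/)))")

def find_comments(source):
    """Return line and block comments in order of appearance."""
    return [m.group(0) for m in ANY_COMMENT.finditer(source)]

# Hypothetical sample mixing both comment styles.
sample = "a = 1; // line\n/* block\n   text */ b = 2; /**/"
print(find_comments(sample))
```

A lone / that is followed by neither / nor *, as in a division, is not matched.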

Here is the short answer:

\/\*(.*?|\s)*\*\/

It starts at /*. Inside the group, .*? consumes the ordinary characters on a line, and \s picks up whitespace, including the line breaks that . does not match by default. The match finally ends at */. This works for me in C#.

Licensed under: CC-BY-SA with attribution