Question

I am trying to write a regex that deletes everything after a comment symbol (#) but ignores # symbols inside strings ("" or '').

So in the example below, only the comment on the last line would be filtered/deleted:

"#dummy comment" asd asd
' abc # dummy comment'
abc #real comment

These are the two regexes I have come up with so far (I will add the s/ part later on):

/('|").*#.*('|")/g        --> selects fake comments

/#(?!!).+/g   ---> highlights all comments, including the ones above ((?!!) is there to ignore #!/usr/bin/env perl)

I'm currently struggling to combine these two expressions to achieve the above result. I've tried using lookaheads and lookbehinds but can't seem to get this right. Any advice at all would be greatly appreciated.


Answer

What about this simple method?

  1. First, remove every substring enclosed in double or single quotes.
  2. Then extract the substring after a # with a simple regular expression capture.

Here is the test program for your sample input:

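#!/usr/bin/perl

use strict;
use warnings;

while (<DATA>) {
    chomp;
    # Step 1: drop quoted strings; \1 forces the closing quote to match
    # the opening one, so "it's" is treated as a single string.
    s/(['"]).*?\1//g;
    # Step 2: whatever follows the first remaining # is a real comment.
    if (m/#(.*)/) {
        print "$1\n";
    }
}

__DATA__
"#dummy comment" asd asd
' abc # dummy comment'
abc #real comment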
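
The test program above prints the comment text and discards the strings along the way. If the goal is instead to delete the comment and keep the rest of the line intact, as the question asks, one possible approach is a single substitution that matches either a complete quoted string (which is put back unchanged) or a comment (which is replaced with nothing). The following is only a sketch: strip_comment is a hypothetical helper name that does not appear in the original answer, and it assumes strings contain no escaped quotes and do not span lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of the original answer: removes a trailing
# comment from a single line while leaving quoted strings untouched.
sub strip_comment {
    my ($line) = @_;
    $line =~ s/
        ( "[^"]*" | '[^']*' )   # group 1: a complete quoted string
      | \#(?!!).*               # or a comment that starts outside a string
    / defined $1 ? $1 : '' /gex;
    return $line;
}

# Sample lines taken from the question.
for my $line (
    q{"#dummy comment" asd asd},
    q{' abc # dummy comment'},
    q{abc #real comment},
) {
    print strip_comment($line), "\n";
}
```

Because the quoted-string alternative is tried first at each position, a # inside quotes is consumed as part of the string and never reaches the comment alternative. For the sample input, the first two lines come back unchanged and the third is reduced to "abc " (the space before the # is kept). The (?!!) lookahead from the question is retained so that a #! shebang line is left alone.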

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow