Question

How would you include a pattern between two characters in a regular expression?

Say I wanted to print everything in the text below except the words between double quotes (" "):

This is an "example".

This "is" "an" example.

"This" is an example.

This is what I've tried so far, but I think I'm missing something:

m/(?!"(.*)").*/g

Solution

$s = 'This "is" "an" example';
@words = ($s =~ /"([^"]*)"/g);

@words now contains every word that appears between double quotes.
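To see how the list-context match behaves on all three sample lines from the question, here is a minimal sketch. The helper name quoted_words is made up for illustration and is not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: return every substring found between double quotes.
sub quoted_words {
    my ($text) = @_;
    # In list context, a /g match returns all captures of group 1.
    return $text =~ /"([^"]*)"/g;
}

my @samples = (
    'This is an "example".',
    'This "is" "an" example.',
    '"This" is an example.',
);

for my $line (@samples) {
    my @found = quoted_words($line);
    print join(', ', @found), "\n";
}
```

A line with no quoted words simply yields an empty list, so the caller does not need a special case for it.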

OTHER TIPS

You could use s/// to remove those substrings between double quotes.

Here is a test program:

#!/usr/bin/perl

use strict;
use warnings;

while (<DATA>) {
    chomp;
    s/"[^"]*"//g;
    print "$_\n";
}

__DATA__
This is an "example".
This "is" "an" example.
"This" is an example.

Result:

$ perl t.pl
This is an .
This   example.
 is an example.
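The output above keeps the spaces that surrounded the removed words. If that matters, one possible follow-up is to tidy the whitespace after the substitution. This is only a sketch: the helper name strip_quoted is hypothetical, and the /r modifier it relies on assumes Perl 5.14 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: drop quoted substrings, then tidy the leftover spacing.
sub strip_quoted {
    my ($text) = @_;
    # /r (Perl 5.14+) returns the modified copy instead of editing in place.
    my $out = $text =~ s/"[^"]*"//gr;
    $out =~ s/\s+/ /g;         # collapse runs of whitespace to a single space
    $out =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
    return $out;
}

my @lines = (
    'This is an "example".',
    'This "is" "an" example.',
    '"This" is an example.',
);

for my $line (@lines) {
    print strip_quoted($line), "\n";
}
```

The space left before the full stop in the first line is not removed, since deciding how punctuation should be reattached is a separate question.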

Similar to redraiment's solution:

@words_in_quotes = ($s =~ /"(.*?)"/g);

No need for lookaround assertions.

This is almost the definition of an XY Problem.

Assertions are sort of an advanced feature of regular expressions, and most likely not going to be needed for the majority of problems you'd have to solve.

Instead, I'd focus on the basics, probably starting with greedy versus non-greedy matching.

@quoted_words = ($s =~ /"(.*?)"/g);

Any time you use a quantifier such as * or +, it will attempt to match as many characters as possible and then work its way back. You can limit this either by restricting the characters it is allowed to match and adding boundary conditions, or by making the match non-greedy by adding a question mark: *? or +?
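The difference is easiest to see by running the three variants against the same input. The sketch below uses the second sample line from the question; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $s = 'This "is" "an" example.';

# Greedy: .* runs to the end of the line, then backtracks to the last quote.
my @greedy = $s =~ /"(.*)"/g;

# Non-greedy: .*? stops at the first closing quote it can reach.
my @lazy = $s =~ /"(.*?)"/g;

# Restricted character class: [^"]* can never cross a quote.
my @negated = $s =~ /"([^"]*)"/g;

print "greedy:  $_\n" for @greedy;     # a single match: is" "an
print "lazy:    $_\n" for @lazy;       # two matches: is, an
print "negated: $_\n" for @negated;    # two matches: is, an
```

The greedy pattern swallows everything from the first quote to the last one, which is why it returns one long match instead of the two separate words.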

Licensed under: CC-BY-SA with attribution