Question

How would you include a pattern between two characters in a regular expression?

Say I wanted to print everything in the text below except the words between double quotes (" "):

This is an "example".

This "is" "an" example.

"This" is an example.

This is what I've tried so far, but I think I'm missing something:

m/(?!"(.*)").*/g

Solution

my $s = 'This "is" "an" example';
my @words = ($s =~ /"([^"]*)"/g);

@words now contains every word found between double quotes.
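
As a minimal, self-contained sketch of the same idea: in list context, a match with the /g modifier returns every capture-group hit, so the quoted words can be collected in one statement. The variable names mirror the snippet above; the print line is only there for illustration.

```perl
use strict;
use warnings;

my $s = 'This "is" "an" example';

# [^"]* matches any run of characters that are not a double quote,
# so each match stops at the nearest closing quote.
my @words = ($s =~ /"([^"]*)"/g);

print join(', ', @words), "\n";   # prints: is, an
```

Note that this collects the quoted words rather than removing them, which is the opposite of what the question asks for; the s/// approach suggested further down handles the removal.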

Other suggestions

You could use s/// to remove those substrings between double quotes.

Here is a test program:

#!/usr/bin/perl

use strict;
use warnings;

while (<DATA>) {
    chomp;
    s/"[^"]*"//g;
    print "$_\n";
}

__DATA__
This is an "example".
This "is" "an" example.
"This" is an example.

Result:

$ perl t.pl
This is an .
This   example.
 is an example.
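
The result above still contains the whitespace that surrounded the removed words. If that matters, one possible follow-up is to collapse and trim the leftover spaces after the substitution. The sketch below assumes that is the desired output; strip_quoted is a hypothetical helper name, not something from the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: remove quoted substrings, then tidy leftover whitespace.
sub strip_quoted {
    my ($line) = @_;
    $line =~ s/"[^"]*"//g;      # remove each "..." span, quotes included
    $line =~ s/\s{2,}/ /g;      # collapse runs of whitespace into one space
    $line =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
    return $line;
}

print strip_quoted($_), "\n" for (
    'This is an "example".',
    'This "is" "an" example.',
    '"This" is an example.',
);
```

With this version the second and third lines come out as "This example." and "is an example."; the first line still reads "This is an ." because the space before the removed word is a single space and is left alone.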

Similar to redraiment's solution:

@words_in_quotes = ($s =~ /"(.*?)"/g);

No lookaround assertions are needed.

This is almost the definition of an XY problem.

Assertions are a fairly advanced feature of regular expressions, and most problems you'll have to solve don't need them.

Instead, I'd focus on the basics, probably starting with greedy versus non-greedy matching.

@quoted_words = ($s =~ /"(.*?)"/g);

Any time you use a quantifier such as * or +, it will attempt to match as many characters as possible and then work its way back. You can limit this either by restricting the characters it may match and adding boundary conditions, or by making the match non-greedy by appending a question mark: *? or +?.
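
To make the difference concrete, here is a small sketch that runs three variants of the pattern against one of the sample lines from the question: a greedy .*, a non-greedy .*?, and a negated character class. The array names are made up for this illustration.

```perl
use strict;
use warnings;

my $s = 'This "is" "an" example.';

# Greedy: .* runs to the end of the string, then backtracks to the LAST quote.
my @greedy  = ($s =~ /"(.*)"/g);

# Non-greedy: .*? stops at the first closing quote it can reach.
my @lazy    = ($s =~ /"(.*?)"/g);

# Restricted character class: [^"]* can never cross a quote at all.
my @negated = ($s =~ /"([^"]*)"/g);

print "greedy:  [$_]\n" for @greedy;    # [is" "an]
print "lazy:    [$_]\n" for @lazy;      # [is] then [an]
print "negated: [$_]\n" for @negated;   # [is] then [an]
```

The greedy version returns a single match spanning both quoted words, while the other two return each word separately.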

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow