Question

Following up from an earlier question on extracting the n'th regex match, I now need to substitute the match, if found.

I thought that I could define the extraction subroutine and call it in the substitution with the /e modifier. I was obviously wrong (admittedly, I had an XY problem).

use strict;
use warnings;

sub extract_quoted { # à la codaddict

        my ($string, $index) = @_;
        while($string =~ /'(.*?)'/g) {
                $index--;
                return $1 if(! $index);
        }
        return;
}

my $string = "'How can I','use' 'PERL','to process this' 'line'";

extract_quoted ( $string, 3 );
$string =~ s/&extract_quoted($string,2)/'Perl'/e;

print $string; # Prints 'How can I','use' 'PERL','to process this' 'line'

There are, of course, many other issues with this technique:

  • What if there are identical matches at different positions?
  • What if the match isn't found?

In light of this situation, I'm wondering in what ways this could be implemented.

Was it helpful?

Solution

Or you can do something as this

use strict;
use warnings;

my $string = "'How can I','use' .... 'perl','to process this' 'line'";

my $cont =0;
sub replacen { # auxiliar function: replaces string if incremented counter equals $index
        my ($index,$original,$replacement) = @_;
        $cont++;
        return $cont == $index ? $replacement: $original;
}

#replace the $index n'th match (1-based counting) from $string by $rep
sub replace_quoted {
        my ($string, $index,$replacement) = @_;
        $cont = 0; # initialize match counter
        $string =~ s/'(.*?)'/replacen($index,$1,$replacement)/eg;
        return $string;
}

my $result = replace_quoted ( $string, 3 ,"PERL");
print "RESULT: $result\n";

A little ugly the "global" $cont variable, that could be polished, but you get the idea.

Update: a more compact version:

use strict;
my $string = "'How can I','use' .... 'perl','to process this' 'line'";

#replace the $index n'th match (1-based counting) from $string by $replacement
sub replace_quoted {
        my ($string, $index,$replacement) = @_;
        my $cont = 0; # initialize match counter
        $string =~ s/'(.*?)'/$cont++ == $index ? $replacement : $1/eg;
        return $string;
}

my $result = replace_quoted ( $string, 3 ,"PERL");
print "RESULT: $result\n";

OTHER TIPS

EDIT: leonbloy came up with this solution first. If your tempted to upvote it, upvote leonbloy's first.

Somewhat inspired by leonbloy's (earlier) answer:

$line = "'How can I','use' 'PERL' 'to process this';'line'";
$n = 3;
$replacement = "Perl";

print "Old line: $line\n";
$z = 0;
$line =~ s/'(.*?)'/++$z==$n ? "'$replacement'" : "'$1'"/ge;
print "New line: $line\n";

Old line: 'How can I','use' 'PERL' 'to process this';'line'
New line: 'How can I','use' 'Perl' 'to process this';'line'

If the regex isn't too much more complicated than what you have, you could follow a split with an edit and a join:

$line = "'How can I','use' 'PERL','to process this' 'line'";

$n = 3;
$new_text = "'Perl'";
@f = split /('.*?')/, $line;
# odd fields of @f contain regex matches
# even fields contain the text between matches
$f[2*$n-1] = $new_text;
$new_line = join '', @f;

See perldoc perlvar:

use strict; use warnings;

use Test::More tests => 5;

my %src = (
    q{'I want to' 'extract the word' 'PERL','from this string'}
    => q{'I want to' 'extract the word' 'Perl','from this string'},
    q{'What about', 'getting','PERL','from','here','?'}
    => q{'What about', 'getting','Perl','from','here','?'},
    q{'How can I','use' 'PERL','to process this' 'line'}
    => q{'How can I','use' 'Perl','to process this' 'line'},
    q{Invalid} => q{Invalid},
    q{'Another invalid string'} => q{'Another invalid string'}
);

while ( my ($src, $target) = each %src ) {
    ok($target eq subst_n($src, 3, 'Perl'), $src)
}

sub subst_n {
    my ($src, $index, $replacement) = @_;
    return $src unless $index > 0;
    while ( $src =~ /'.*?'/g ) {
        -- $index or return join(q{'},
            substr($src, 0, $-[0]),
            $replacement,
            substr($src, $+[0])
        );
    }
    return $src;
}

Output:

C:\Temp> pw
1..5
ok 1 - 'Another invalid string'
ok 2 - 'How can I','use' 'PERL','to process this' 'line'
ok 3 - Invalid
ok 4 - 'What about', 'getting','PERL','from','here','?'
ok 5 - 'I want to' 'extract the word' 'PERL','from this string'

Of course, you need to decide what happens if an invalid $index is passed or if the required match is not found. I just return the original string in the code above.

Reworking an answer to an earlier question, match n-1 times and then replace the next. Memoizing patterns spares poor Perl having to recompile the same patterns over and over.

my $_quoted = qr/'[^']+'/; # ' fix Stack Overflow highlighting
my %_cache;
sub replace_nth_quoted { 
  my($string,$index,$replace) = @_;
  my $pat = $_cache{$index} ||=
    qr/ ^
        (                    # $1
          (?:.*?$_quoted.*?) # match quoted substrings...
            {@{[$index-1]}}  # $index-1 times
        )
        $_quoted             # the ${index}th match
      /x;

  $string =~ s/$pat/$1$replace/;
  $string;
}

For example

my $string = "'How can I','use' 'PERL','to process this' 'line'";
print replace_nth_quoted($string, 3, "'Perl'"), "\n";

outputs

'How can I','use' 'Perl','to process this' 'line'
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top