Ignore cases for certain match in Perl

https://stackoverflow.com/questions/11838261

25-06-2021

Question

Question 1

I want to match the pattern dc_abc and replace it with dc_ABC, but if it appears as .dc_abc or "dc_abc", it should stay unchanged.

Input file:

.dc_abc (dc_abc);
.dc_abc({dc_abc});
dc_abc("dc_abc");

Output file:

.dc_abc (dc_ABC);
.dc_abc({dc_ABC});
dc_ABC("dc_abc");

Question 2

Is there any way in Perl that I can create two arrays like:

@match_pattern  = ('!dc_abc', 'dc_abc:', 'dc_abc');
@ignore_pattern = ('.dc_abc', '{dc_abc}');

If a pattern belongs to @match_pattern, replace it with dc_ABC.
If a pattern belongs to @ignore_pattern, don't do anything.

Input file:

.dc_abc(dc_abc, {dc_abc});
!dc_abc(!dc_abc);
dc_abc: (dc_abc:);

Output file:

.dc_abc(dc_ABC , {dc_abc});
!dc_abc(dc_ABC);
dc_ABC (dc_ABC);

Solution

You can use a negative lookbehind assertion. It tells the regular expression that the part you want to match can't come right after a literal dot or double quote. The (?<! ) part is the negative lookbehind. Inside it is the pattern [\."], a character class listing the characters that can't precede the rest of the pattern (a dot needs no escaping inside a character class, so the backslash is optional):

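use v5.10;    # enables say
use strict;
use warnings;

while( <DATA> ) {
    chomp;
    # Replace dc_abc unless the character just before it is a dot or a double quote.
    s/(?<![\."])dc_abc/dc_ABC/g;
    say;
}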


__END__
.dc_abc (dc_abc);
.dc_abc({dc_abc});
dc_abc("dc_abc");

This gives:

.dc_abc (dc_ABC);
.dc_abc({dc_ABC});
dc_ABC("dc_abc");

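Lookarounds are zero-width: they check what surrounds the match without consuming any characters. Lookbehinds in Perl have traditionally had to be fixed width (so no open-ended quantifiers such as * or +); Perl 5.30 added experimental support for variable-length lookbehind of limited length.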
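
To make those two properties concrete, here is a small sketch. The sample strings, the `::` prefix, and the helper name `upcase_unqualified` are made up for illustration and are not part of the question. The first part shows that the character inspected by the lookbehind is not part of the match. The second part shows one way to work within the fixed-width rule when the excluded prefixes have different lengths: stack one lookbehind per prefix.

```perl
use strict;
use warnings;
use v5.10;

# Zero-width: the character before the match is inspected but not consumed.
# (Sample string is hypothetical.)
my $text = 'x.dc_abc ydc_abc';
if ( $text =~ /(?<![."])(dc_abc)/ ) {
    # $-[1] is the start offset of capture group 1.
    printf "matched '%s' at offset %d\n", $1, $-[1];
}

# Hypothetical case: skip dc_abc after '.' (1 char) or '::' (2 chars).
# Each lookbehind is fixed width on its own, so the two can be stacked.
sub upcase_unqualified {
    my ($line) = @_;
    $line =~ s/(?<!\.)(?<!::)dc_abc/dc_ABC/g;
    return $line;
}

say upcase_unqualified('a.dc_abc b::dc_abc dc_abc');
```

The first match reports 'dc_abc' at offset 10: the preceding y was checked but is not included in the match. The second call should print a.dc_abc b::dc_abc dc_ABC, since only the last occurrence has neither prefix.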

For question 2, Perl has all the tools you need and leaves it up to you to finish the logic. I don't particularly feel like working it out this late at night, though. Maybe I'll think of something later. There are various brute-force ways to deal with it, but there's probably something clever.
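
One brute-force possibility, offered only as a sketch and not as the answerer's method: turn both lists into alternations of escaped literals, try the ignore list first, and put back whatever it matched. The helper names `build_alternation` and `rewrite` are invented for this example, and the arrays are the asker's lists written as quoted strings.

```perl
use strict;
use warnings;
use v5.10;

# The asker's two lists, written as quoted literal strings.
my @match_pattern  = ('!dc_abc', 'dc_abc:', 'dc_abc');
my @ignore_pattern = ('.dc_abc', '{dc_abc}');

# Hypothetical helper: escape each literal and join them with |,
# longest first so 'dc_abc:' is tried before the shorter 'dc_abc'.
sub build_alternation {
    my @literals = @_;
    return join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) } @literals;
}

my $ignore = build_alternation(@ignore_pattern);
my $match  = build_alternation(@match_pattern);

# Hypothetical helper: the ignore alternatives are captured in $1 and
# written back unchanged; any other hit comes from @match_pattern and
# is replaced with dc_ABC.
sub rewrite {
    my ($line) = @_;
    $line =~ s/($ignore)|(?:$match)/ defined $1 ? $1 : 'dc_ABC' /ge;
    return $line;
}

say rewrite($_) for (
    '.dc_abc(dc_abc, {dc_abc});',
    '!dc_abc(!dc_abc);',
    'dc_abc: (dc_abc:);',
);
```

Applied literally, the two lists give .dc_abc(dc_ABC, {dc_abc}); for the first line, dc_ABC(dc_ABC); for the second, and dc_ABC (dc_ABC); for the third. The second line differs from the sample output in the question, which keeps the leading !dc_abc; the lists alone don't say why that occurrence should be skipped, so that rule would need to be added separately.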

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow