Substitute a sentence of a text with the corresponding sentence of another text using Perl

https://stackoverflow.com/questions/22323392

12-06-2023

Question

I have a text file like this:

mc1s2  L#'|NA|det indice|indice|nc Sensex|NA|adj
progressait|progresser|v de|de|prep

and another text file like this:

programmer:_[1]_:_P0_(P1)=1
progresser:_[1]_:_P0=1
prohiber:_[1]_:_P0_P1=1
projeter:_[3]_:_P0_P1=1;_:_P0_P1_(PL)=1;_:_P0_P1_(PP<sur>)=1

I would like to do a replacement that produces a third text file like this:

mc1s2  L#'|NA|det indice|indice|nc Sensex|NA|adj
progresser:_[1]_:_P0=1 de|de|prep

As you can see, I'd like to replace progressait|progresser|v with progresser:_[1]_:_P0=1.

I would like to do this for all verbs.

This script does what I need, but I can't understand the last part of it:

use strict;
use warnings;
use autodie;

my $lookupfile = 'lookup.txt';
# Contains:
# programmer:_[1]_:_P0_(P1)=1
# progresser:_[1]_:_P0=1 
# prohiber:_[1]_:_P0_P1=1
# projeter:_[3]_:_P0_P1=1;_:_P0_P1_(PL)=1;_:_P0_P1_(PP<sur>)=1

my $datafile = 'data.txt';
# Contains:
# mc1s2  L#'|NA|det indice|indice|nc Sensex|NA|adj progressait|progresser|v de|de|prep 

my %lookup;
open my $fh, '<', $lookupfile;
while (<$fh>) {
    chomp;
    my ($field) = split ':';
    $lookup{$field} = $_;
}

# use Data::Dump; # Used to debug the lookup table.
# dd \%lookup;

open $fh, '<', $datafile;
while (<$fh>) {
    s{(?<=\s)(\S+)} {
        my $entry = $1;
        my @fields = split '\|', $entry;
        $lookup{$fields[1]} // $entry;
    }eg;

    print;
}

I can't understand this part:

open $fh, '<', $datafile;
while (<$fh>) {
    s{(?<=\s)(\S+)}{
        my $entry = $1;
        my @fields = split '\|', $entry;
        $lookup{$fields[1]} // $entry;
    }eg;

Can you help me?

Solution

This substitution

s{(?<=\s)(\S+)}{
    my $entry = $1;
    my @fields = split '\|', $entry;
    $lookup{$fields[1]} // $entry;
}eg;

uses the /e modifier, which indicates that the replacement string is not to be used directly, but executed as Perl code to generate the string to replace the match.
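
To see /e in isolation, here is a minimal sketch that is unrelated to the lookup data. The sample string and variable names are made up for illustration. Each run of digits is replaced with the value of the expression $1 * 2:

```perl
use strict;
use warnings;

# Hypothetical sample text, not taken from the question's data.
my $text = 'a=2 b=10';

# /e: the replacement part is Perl code, and its value replaces the match.
# /g: repeat the substitution for every match in the string.
(my $doubled = $text) =~ s{(\d+)}{ $1 * 2 }eg;

print "$doubled\n";   # prints: a=4 b=20
```

Without /e, the replacement would be the literal text " 2 * 2 " (with $1 interpolated), not the computed number.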

The match finds each sequence of non-space characters that follows a whitespace character (the /g modifier repeats the substitution for every such match), so in this case $1 is initially set to L#'|NA|det. The first token, mc1s2, is skipped because no whitespace precedes it.
$1 is copied to $entry, and $entry is split on the pipe characters | into @fields.
The %lookup hash is indexed with $fields[1], the second entry in @fields. Here that is the string NA.
The code block returns the value of that hash element, or the whole of $entry if there is no hash element with that key; the // (defined-or) operator makes that choice. Because $entry is the whole of the matched string, a match with no corresponding element in %lookup is replaced with itself, so nothing changes.
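
The steps above can be sketched as a self-contained script. This is only an illustration under a few assumptions: the lookup table is written inline instead of being read from lookup.txt, the helper name replace_entry is hypothetical, and the check for a missing second field is an extra guard that the original script does not have.

```perl
use strict;
use warnings;

# Inline stand-in for the table the script builds from lookup.txt.
my %lookup = (
    progresser => 'progresser:_[1]_:_P0=1',
    prohiber   => 'prohiber:_[1]_:_P0_P1=1',
);

# Hypothetical helper doing the same work as the replacement block.
sub replace_entry {
    my ($entry) = @_;
    my @fields = split /\|/, $entry;          # e.g. ('progressait', 'progresser', 'v')
    return $entry unless defined $fields[1];  # added guard: no pipe, nothing to look up
    return $lookup{ $fields[1] } // $entry;   # lookup hit, otherwise keep the entry
}

my $line = q{mc1s2  L#'|NA|det indice|indice|nc Sensex|NA|adj progressait|progresser|v de|de|prep};

# (?<=\s) is a lookbehind: the token must follow whitespace, but that
# whitespace is not part of the match, so the spacing is preserved.
(my $result = $line) =~ s{(?<=\s)(\S+)}{ replace_entry($1) }eg;

print "$result\n";
# prints: mc1s2  L#'|NA|det indice|indice|nc Sensex|NA|adj progresser:_[1]_:_P0=1 de|de|prep
```

Only progressait|progresser|v changes, because progresser is the only second field that exists as a key in %lookup. Every other token is replaced with itself.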

I hope this helps

Licensed under: CC-BY-SA with attribution