How to get the name of the input file in a Perl one-liner?
08-10-2019
Question
cat monday.csv
223.22;1256.4
227.08;1244.8
228.08;1244.7
229.13;1255.0
227.89;1243.2
224.77;1277.8
cat tuesday.csv
227.02;1266.3
227.09;1234.9
225.18;1244.7
224.13;1255.3
228.59;1263.2
224.70;1247.6
This Perl one-liner prints the row with the highest value in the second column, considering only the rows whose first column starts with 227 or 228, from the file "monday.csv":
$ perl -F\; -ane '$hash{$_} = $F[1] if /22[78]/; END{ print and exit for sort{ $hash{$b} <=> $hash{$a} } keys %hash }' monday.csv
This Perl one-liner prints the row with the highest value in the second column, considering only the rows whose first column starts with 227 or 228, across all *day.csv files taken together:
$ perl -F\; -ane '$hash{$_} = $F[1] if /22[78]/; END{ print and exit for sort{ $hash{$b} <=> $hash{$a} } keys %hash }' *day.csv
How could I rewrite this one-liner to get output of the form:
filename : "row with the highest value in the second column, among the rows whose first column starts with 227 or 228, from the file 'filename.csv'"
for each *day.csv file?
Solution
You can use $ARGV for the current file name. If you're only interested in the max, there's no need to store all the values and then sort them; instead, just store the max for each file. Also, your regex should probably be anchored to the start of the line.
# Line breaks added for display purposes.
perl -F\; -ane '
$max{$ARGV} = $F[1] if /^22[78]/ and $F[1] > $max{$ARGV};
END{ print "$_\t$max{$_}" for sort keys %max}
' *day.csv
Or, if you want to store the entire line where the max occurs:
perl -F\; -ane '
($max{$ARGV}{ln}, $max{$ARGV}{mx}) = ($_, $F[1])
if /^22[78]/ and $F[1] > $max{$ARGV}{mx};
END{ print "$_\t$max{$_}{ln}" for sort keys %max}
' *day.csv
OTHER TIPS
The filename is contained in the $ARGV variable:
$ARGV contains the name of the current file when reading from <>.
However, the one-liners presented have an issue; what if you have repeated values of your first column?
A better one-liner would be:
$ perl -F\; -MList::Util=max -lane 'push @{ $wanted{$ARGV} }, $F[1] if $F[0] =~ /22[78]/; } END { print "$_ : ", max(@{ $wanted{$_} }) for keys %wanted;' *.csv
Based on the comment:
$ perl -F\; -lane '$wanted{$ARGV} = [@F] if $F[0] =~ /22[78]/ && $F[1] >= $wanted{$ARGV}[1]; } END { print "$_ : @{ $wanted{$_} }" for keys %wanted;' *.csv
It seems that you can use $ARGV. See "current filename".
If I want the whole row, I can do this (based on FM's answer):
perl -F\; -ane '$max{$ARGV} = $_ if /^22[78]/ and $F[1] >= (split /;/, $max{$ARGV})[1]; END{ print "$_\t$max{$_}" for sort keys %max}' *day.csv
I found a shorter solution.
all files:
perl -F\; -anE '$max{$ARGV} = [@F] if /^22[78]/ and $F[1] >= $max{$ARGV}->[1]; END{ print "$_\t@{$max{$_}}" for sort keys %max}' *day.csv
one file:
perl -F\; -anE '$max = [@F] if /^22[78]/ and $F[1] >= $max->[1]; END{ print "@$max" }' monday.csv
or, if there is not much space available:
perl -F\; -anE'$m{$ARGV}=[@F]if/^22[78]/&&$F[1]>=$m{$ARGV}[1]}print"$_\t@{$m{$_}}"for sort keys%m;{' *day.csv
perl -F\; -anE'$m=[@F]if/^22[78]/&&$F[1]>=$$m[1]}print"@$m";{' monday.csv
As Zaid pointed out, the highest value may occur more than once in a file. To get the last row with that value, I changed the "$F[1] > $max..." part to "$F[1] >= $max...".