Perhaps the following will be helpful:
use strict;
use warnings;
my @files = @ARGV;
pop @ARGV;    # leave only the first file on @ARGV for the <> read below
# Key each line by everything up to its last comma (i.e., the first two columns)
my %file1 = map { chomp; /(.+),/; $1 => $_ } <>;
push @ARGV, $files[1];    # queue the second file and read it the same way
my %file2 = map { chomp; /(.+),/; $1 => $_ } <>;
print "$files[0]:\n";
print $file1{$_}, "\n" for grep !exists $file2{$_}, keys %file1;
print "\n$files[1]:\n";
print $file2{$_}, "\n" for grep !exists $file1{$_}, keys %file2;
Usage: perl script.pl file1.txt file2.txt
Output on your datasets:
file1.txt:
cat,val 1,43432
file2.txt:
cat,val 3,22
bird,output,9999
This builds a hash for each file: the keys are the first two columns and the associated values are the full lines. grep then filters out the shared keys, so only lines whose key is missing from the other file get printed.
Edit: On smaller files, using map as above to process the file's lines works fine. However, it first builds a list of all of the file's lines and then passes that whole list to map. On larger files, it may be better to use a while (<>) { ... } construct, which reads one line at a time. The code below does this, generating the same output as above, and uses a hash of hashes (HoH). Because it uses a HoH, you'll notice some dereferencing:
use strict;
use warnings;
my %hash;
my @files = @ARGV;
while (<>) {
    chomp;
    # $ARGV holds the name of the file currently being read by <>
    $hash{$ARGV}{$1} = $_ if /(.+),/;
}
print "$files[0]:\n";
print $hash{ $files[0] }{$_}, "\n"
for grep !exists $hash{ $files[1] }{$_}, keys %{ $hash{ $files[0] } };
print "\n$files[1]:\n";
print $hash{ $files[1] }{$_}, "\n"
for grep !exists $hash{ $files[0] }{$_}, keys %{ $hash{ $files[1] } };