Compare two files and make union

https://stackoverflow.com/questions/18751894

28-06-2022
|

Question

I tested a line below to compare 1st columns in 2 files and make an union. However the different value with identical 1st column in file2 was eliminated. Below I attached sample files, obtained result, and desired result. Would you please assist me.

awk -F, 'BEGIN{OFS=","}FNR==NR{a[$1]=$1","$2;next}($1 in a && $2=$2","a[$1])' file2.csv file1.csv >testout.txt

file1
John,red
John,blue
Mike,red
Mike,blue
Carl,red
Carl,blue

file2
John,V1
John,V2
Kent,V1
Kent,V2
Mike,V1
Mike,V2

obtained result
John,red,John,V2
John,blue,John,V2
Mike,red,Mike,V2
Mike,blue,Mike,V2

desired result
John,red,John,V1
John,red,John,V2
John,blue,John,V1
John,blue,John,V2
Mike,red,Kent,V1
Mike,red,Kent,V2
Mike,blue,Kent,V1
Mike,blue,Kent,V2

La solution

try this one-liner:

 awk -F, -v OFS="," 'NR==FNR{a[$0];next}{for(x in a)if(x~"^"$1FS)print $0,x}' file2 file1

test:

kent$  awk -F, -v OFS="," 'NR==FNR{a[$0];next}{for(x in a)if(x~"^"$1FS)print $0,x}' f2 f1
John,red,John,V1
John,red,John,V2
John,blue,John,V1
John,blue,John,V2
Mike,red,Mike,V1
Mike,red,Mike,V2
Mike,blue,Mike,V1
Mike,blue,Mike,V2

Autres conseils

Using join could do that:

join -t, -1 1 -2 1 --nocheck-order -o 1.1 1.2 2.1 2.2 file1 file2

Output:

John,red,John,V1
John,red,John,V2
John,blue,John,V1
John,blue,John,V2
Mike,red,Mike,V1
Mike,red,Mike,V2
Mike,blue,Mike,V1
Mike,blue,Mike,V2

Licencié sous: CC-BY-SA avec attribution

Non affilié à StackOverflow