This is a very common use for awk:
$ awk 'FNR==NR{a[$0];next}!($0 in a)' file1 file2
alice
chris
elvis
It's easier to just rewrite the whole of file3 than to update it:
$ awk 'FNR==NR{a[$0];next}!($0 in a)' file1 file2 > file3
Explanation:
NR is an awk variable that is incremented as each record is read, across all input files. FNR is similar but gets reset to 1 every time a new file is read. FNR==NR can therefore only be true while reading the first file. While reading the first file we create an array a whose keys are the lines of the file; as well as storing all the lines from file1, this removes any duplicates. next is a command that makes sure no further blocks get executed on the current record. Once file1 has been read, we just check whether the current line in file2 is found in the array (i.e. was in file1). The condition !($0 in a) has no block to execute, so by default awk executes {print $0}, printing only the lines of file2 that are not in file1.
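To see the two counters diverge, here is a small sketch you can run in a scratch directory. The file contents are made up for illustration (the real file1 and file2 are not shown here); they are chosen so that the final command prints the same three names as above.

```shell
# Hypothetical sample data in a temporary directory (not the real files).
tmp=$(mktemp -d)
printf '%s\n' bob dave bob > "$tmp/file1"      # note the duplicate "bob"
printf '%s\n' alice bob chris dave elvis > "$tmp/file2"

# Print the file name, NR and FNR for every record.
# NR keeps counting across files; FNR restarts at 1 for file2.
counters=$(cd "$tmp" && awk '{print FILENAME, NR, FNR}' file1 file2)
printf '%s\n' "$counters"
# file1 1 1
# file1 2 2
# file1 3 3
# file2 4 1
# file2 5 2
# file2 6 3
# file2 7 4
# file2 8 5

# The one-liner from the answer: lines of file2 that are not in file1.
result=$(cd "$tmp" && awk 'FNR==NR{a[$0];next}!($0 in a)' file1 file2)
printf '%s\n' "$result"
# alice
# chris
# elvis

rm -rf "$tmp"
```

FNR==NR holds only on the first three records, so only file1 populates the array. One caveat worth knowing: if file1 is empty, FNR==NR is also true while reading file2, and nothing is printed.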
There is plenty wrong with your script. If you want to learn awk, the best thing to do would be to read Effective Awk Programming.