Question

Ok so I'm still learning the command line stuff like grep and diff and their uses within the scope of my project, but I can't seem to wrap my head around how to approach this problem.

So I have 2 files, each containing hundreds of 20-character strings. Let's call the files A and B. I want to search through A and, using the values in B as keys, locate UNIQUE string entries that occur in A but not in B (there are duplicates, so unique is the key here).

Any ideas?

Also I'm not opposed to finding the answer myself, but I don't have a good enough understanding of the different command line scripts and their functions to really start thinking of how to use them together.

Solution

There are two ways to do this: with comm, or with grep, sort, and uniq.

comm

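comm -23 afile bfile

comm compares two sorted files and outputs three columns: lines only in afile, lines only in bfile, and lines common to both. The -2 and -3 switches tell comm not to print those columns, which leaves just the lines that occur only in afile. Both files must already be sorted; if they are not, run each through sort -u first, which also drops the duplicates.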
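To keep only the lines unique to afile, columns 2 and 3 are the ones to suppress. Here is a minimal sketch of that on throwaway data. The sample words, the temporary directory, and the a.sorted/b.sorted names are assumptions made up for illustration; they are not part of the question.

```shell
#!/bin/sh
# Use one fixed collation so sort and comm agree on ordering.
LC_ALL=C
export LC_ALL

# Hypothetical sample data standing in for the real A and B files.
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana apple date > "$workdir/afile"
printf '%s\n' banana elder banana > "$workdir/bfile"

# comm expects sorted input; sort -u also collapses duplicate lines.
sort -u "$workdir/afile" > "$workdir/a.sorted"
sort -u "$workdir/bfile" > "$workdir/b.sorted"

# -2 hides "only in bfile", -3 hides "in both"; column 1 is what remains.
only_in_a=$(comm -23 "$workdir/a.sorted" "$workdir/b.sorted")
printf '%s\n' "$only_in_a"

rm -rf "$workdir"
```

With this sample data the script should print apple, cherry, and date, one per line.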

grep sort uniq

grep -F -v -f bfile afile | sort | uniq

or just

grep -F -v -f bfile afile | sort -u

if your sort handles the -u option.

(Note: the command fgrep, if your system has it, is equivalent to grep -F.)
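The grep route can be tried the same way on throwaway data. This is only a sketch: the sample words and temporary directory are made up for illustration, and the -x flag is an addition that is not in the answer above. It makes each key match whole lines only, which should not matter when every line is exactly 20 characters but is safer in general.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the real A and B files.
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana apple date > "$workdir/afile"
printf '%s\n' banana elder banana > "$workdir/bfile"

# -F  treat each pattern as a fixed string, not a regular expression
# -x  (added here as a precaution) match whole lines only
# -v  print the lines that do NOT match
# -f  read the patterns from bfile, one per line
# sort -u then removes the duplicates left over from afile.
only_in_a=$(grep -F -x -v -f "$workdir/bfile" "$workdir/afile" | sort -u)
printf '%s\n' "$only_in_a"

rm -rf "$workdir"
```

Unlike comm, this approach does not need either file to be sorted beforehand; the sort at the end is only there to deduplicate.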

Other tips

Look up the comm command (POSIX comm) to do this. See also "Unix command to find lines common in two files".

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow