How can I remove common occurrences between 2 text files using the unix environment?

StackOverflow https://stackoverflow.com/questions/21293807

Question

OK, so I'm still learning command-line tools like grep and diff and their uses within the scope of my project, but I can't seem to wrap my head around how to approach this problem.

So I have 2 files, each containing hundreds of 20-character strings. Let's call the files A and B. I want to search through A and, using the values in B as keys, locate the UNIQUE string entries that occur in A but not in B (there are duplicates, so unique is the key here).

Any ideas?

Also, I'm not opposed to finding the answer myself, but I don't have a good enough understanding of the different command-line tools and their functions to really start thinking about how to use them together.


Solution

There are two ways to do this: with comm, or with grep, sort, and uniq.

comm

comm -23 afile bfile

comm compares two sorted files line by line and outputs 3 columns: lines only in afile, lines only in bfile, and lines common to both. The -1, -2, and -3 switches tell comm to suppress the corresponding columns, so -23 prints only the lines unique to afile. Note that both files must be sorted first.
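A quick demonstration of the comm route, using made-up sample data (the contents of afile and bfile here are hypothetical, chosen to show the duplicate-handling):

```shell
# Work in a scratch directory with made-up data.
dir=$(mktemp -d)
cd "$dir"
printf '%s\n' AAA BBB CCC CCC > afile   # already sorted; CCC is duplicated
printf '%s\n' BBB DDD > bfile           # already sorted
# -2 hides lines unique to bfile, -3 hides common lines;
# what remains is column 1: lines that appear only in afile.
# uniq then collapses the duplicate CCC entries.
comm -23 afile bfile | uniq
# prints:
# AAA
# CCC
```

Because comm expects sorted input, unsorted files should be run through sort first, e.g. with process substitution: comm -23 <(sort afile) <(sort bfile).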

grep sort uniq

grep -F -v -f bfile afile | sort | uniq

or just

grep -F -v -f bfile afile | sort -u

if your sort handles the -u option.

(Note: the command fgrep, if your system has it, is equivalent to grep -F.)
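The grep route in action, again with made-up sample data (the file contents are hypothetical; the extra -x flag is my addition, not part of the answer above, but it is safer here since it restricts grep to whole-line matches rather than substring matches):

```shell
# Work in a scratch directory with made-up data.
dir=$(mktemp -d)
cd "$dir"
printf '%s\n' AAA CCC BBB CCC > afile   # unsorted, contains a duplicate
printf '%s\n' BBB DDD > bfile
# -F: treat patterns as fixed strings, not regexes
# -v: keep the lines that do NOT match
# -f bfile: read the patterns (one per line) from bfile
# -x: match whole lines only (avoids substring false positives)
grep -F -x -v -f bfile afile | sort -u
# prints:
# AAA
# CCC
```

Unlike comm, this pipeline does not require the input files to be pre-sorted; the sort -u at the end handles both ordering and de-duplication.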

Other tips

Look up the comm command (POSIX comm) to do this. See also Unix command to find lines common in two files.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow