Question

I have two files containing unique strings: 1.txt has ~1,000,000 unique strings and 2.txt has ~10,000 unique strings. I want to remove from 1.txt every string that also appears in 2.txt. When I checked for occurrences of the 2.txt strings in 1.txt (using the comm command), I found that almost all of 2.txt is in 1.txt. I am using the following command (grep -Ev -f 2.txt 1.txt). It should output about 990,000 strings, but it gives me only 95,000.

Is this a bug in grep, or did I miss something?


Solution

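The -E option tells grep to treat the patterns as extended regular expressions, but it seems you want them treated as fixed strings, which is what the -F option does. With -E, characters such as . or + in 2.txt act as regex metacharacters, so a pattern can match lines other than its literal text, and -v then removes those lines too. Please try the following command instead: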

grep -Fv -f 2.txt 1.txt
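
To see the difference on a small scale, here is a minimal sketch. The sample strings and the temporary directory are made up for illustration and are not the asker's real data. It also tries -x (match whole lines only), on the assumption that each string in 2.txt is meant to match an entire line of 1.txt; whether that applies to your files is something only you can tell.

```shell
# Hypothetical sample data, standing in for the real 1.txt and 2.txt.
tmpdir=$(mktemp -d)
printf '%s\n' 'a.c' 'abc' 'a+b' 'aab' 'a.cd' 'keep' > "$tmpdir/1.txt"
printf '%s\n' 'a.c' 'a+b' > "$tmpdir/2.txt"

# -E: each pattern is an extended regex. 'a.c' also matches 'abc' and 'a.cd',
# and 'a+b' matches 'abc' and 'aab' but NOT the literal line 'a+b'.
regex_out=$(grep -Ev -f "$tmpdir/2.txt" "$tmpdir/1.txt")

# -F: each pattern is a literal string, matched anywhere in the line,
# so 'a.cd' is still removed because it contains 'a.c'.
fixed_out=$(grep -Fv -f "$tmpdir/2.txt" "$tmpdir/1.txt")

# -F -x: literal strings that must match the whole line.
exact_out=$(grep -Fxv -f "$tmpdir/2.txt" "$tmpdir/1.txt")

printf 'with -Ev:\n%s\n\nwith -Fv:\n%s\n\nwith -Fxv:\n%s\n' \
    "$regex_out" "$fixed_out" "$exact_out"

rm -rf "$tmpdir"
```

On this sample, -Ev leaves only a+b and keep, -Fv leaves abc, aab and keep, and -Fxv leaves abc, aab, a.cd and keep. If the line count from -Fv is still lower than expected, substring matches may be the reason, and adding -x is worth a try.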
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow