Question

The Problem

Suppose I have a text file containing a list of words, one word per line. Take the following as an example, which I'll call my_dictionary_file.txt:

my_dictionary_file.txt

Bill
Henry
Martha
Sally
Alex
Paul

In my current directory, I have several files which contain the above names. The problem is that I do not know which files contain which names. This is what I'd like to find out; a sort of matching game. In other words, I want to match each name in my_dictionary_file.txt to the file in which the name appears.

As an example, let's say that the files in my working directory look like the following:

file1.txt

There is a man called Bill. He is tall.

file2.txt

There is a girl called Martha. She is small.

file3.txt

Henry and Sally are a couple.

file4.txt

Alex and Paul are two bachelors.

What I've tried

First. Using the fgrep command with the -o and -f options,

$ fgrep -of my_dictionary_file.txt file1.txt
Bill

I can identify that the name Bill can be found in file1.txt.

Second. Using the fgrep command with the -r -l and -f options,

$ fgrep -rlf my_dictionary_file.txt .
./my_dictionary_file.txt
./file1.txt
./file4.txt
./file3.txt
./file2.txt

I can search through all of the files in the current directory and list those that contain at least one name from my_dictionary_file.txt. This does not tell me which names each file contains, though.


The sought-after solution

The solution that I am looking for would combine the two attempts above. To be more explicit, I'd like to know that:

Bill belongs to file1.txt
Martha belongs to file2.txt
Henry and Sally belong to file3.txt
Alex and Paul belong to file4.txt

Any suggestions or pointers towards commands other than fgrep would be greatly appreciated!


Note

The actual problem that I am trying to solve is a scaled up version of this simplified example. I'm hoping to base my answer on responses to this question, so bear in mind that in reality the dictionary file contains hundreds of names and that there are a hundred or more files in the current directory.

Typing

$ fgrep -of my_dictionary_file.txt file1.txt
Bill

$ fgrep -of my_dictionary_file.txt file2.txt
Martha

$ fgrep -of my_dictionary_file.txt file3.txt
Henry
Sally

$ fgrep -of my_dictionary_file.txt file4.txt
Alex
Paul

does, of course, get me the results, but I'm looking for an efficient way to collect them all in one go, perhaps by redirecting the output to a single .txt file.

Solution

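If you run fgrep on all the files at once with the -o option, it prints both the file name and the text that matched, because grep prefixes each match with its file name whenever it is given more than one file (dict.txt here stands in for my_dictionary_file.txt):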

$ fgrep -of dict.txt file*.txt
file1.txt:Bill
file2.txt:Martha
file3.txt:Henry
file3.txt:Sally
file4.txt:Alex
file4.txt:Paul
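
To collect everything in a single file, as the question asks, the same command can be redirected and then grouped per file. The following is only a sketch under a few assumptions: it recreates the question's sample files so it can be run in an empty scratch directory, the output names matches.txt and by_file.txt are hypothetical, and it uses grep -F, which is the standard spelling of fgrep. The -o option is assumed to be available, as it is in GNU and BSD grep.

```shell
#!/bin/sh
# Run in an empty scratch directory: this recreates the question's sample files.
printf '%s\n' Bill Henry Martha Sally Alex Paul > my_dictionary_file.txt
printf '%s\n' 'There is a man called Bill. He is tall.' > file1.txt
printf '%s\n' 'There is a girl called Martha. She is small.' > file2.txt
printf '%s\n' 'Henry and Sally are a couple.' > file3.txt
printf '%s\n' 'Alex and Paul are two bachelors.' > file4.txt

# -F: treat patterns as fixed strings (what fgrep does)
# -o: print only the matched text
# -f: read the patterns from a file
# With several input files, grep prefixes each match with "filename:".
# matches.txt is a hypothetical output name.
grep -F -o -f my_dictionary_file.txt file*.txt > matches.txt

# Group the "file:name" pairs into one line per file (by_file.txt is hypothetical).
# Assumes neither the file names nor the dictionary words contain a colon.
awk -F: '
    !($1 in names) { order[++n] = $1 }    # remember files in first-seen order
    { names[$1] = names[$1] sep[$1] $2; sep[$1] = " " }
    END { for (i = 1; i <= n; i++) print order[i] ": " names[order[i]] }
' matches.txt > by_file.txt

cat by_file.txt
```

For the sample files above, by_file.txt should end up with one line per file, for example "file3.txt: Henry Sally". A name that occurs several times in a file is listed once per occurrence; piping the grep output through sort -u before grouping would remove the repeats. With GNU or BSD grep, adding -w restricts matches to whole words, so that Bill is not reported for a file that only mentions Billy.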
Licensed under: CC-BY-SA with attribution