Here is a solution that matches every occurrence of all keywords:
#!/usr/bin/perl
use strict;
use warnings;
# A lexical variable is preferred for a filehandle; always check that open succeeded.
open my $keywords, '<', 'keywords.txt' or die "Can't open keywords: $!";
open my $search_file, '<', 'search.txt' or die "Can't open search file: $!";
my $keyword_or = join '|', map {chomp;qr/\Q$_\E/} <$keywords>;
my $regex = qr|\b($keyword_or)\b|;
while (<$search_file>)
{
    while (/$regex/g)
    {
        print "$.: $1\n";
    }
}
keywords.txt:
hello
foo
bar
search.txt:
plonk
food is good
this line doesn't match anything
bar bar bar
hello world
lalalala
hello everyone
Output:
4: bar
4: bar
4: bar
5: hello
7: hello
Explanation:
This creates a single regex that matches all of the keywords in the keywords file.
<$keywords>
- when this is used in list context, it returns a list of all lines of the file.
map {chomp;qr/\Q$_\E/}
- this removes the newline from each line and applies the \Q...\E quote-literal regex operator to it. This ensures that a keyword like "foo.bar" treats the dot as a literal character, not as a regex metacharacter.
join '|',
- this joins the resulting list into a single string, separated by pipe characters.
my $regex = qr|\b($keyword_or)\b|;
- this creates a regex equivalent to:
/\b(\Qhello\E|\Qfoo\E|\Qbar\E)\b/
This regex will match any of your keywords. \b is the word boundary marker, ensuring that only whole words match: "food" no longer matches foo. The parentheses capture the specific keyword that matched in $1, which is how the output prints the keyword that matched.
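To see the quoting and the word boundaries in isolation, here is a minimal sketch. The helper name build_keyword_regex, the keyword "a.c" and the sample strings are made up for illustration. It uses quotemeta, the function form of \Q...\E, instead of building a qr// per keyword.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): builds one
# alternation regex from a list of keywords.
sub build_keyword_regex {
    my @keywords = @_;
    # quotemeta escapes regex metacharacters, just like \Q...\E
    my $alternation = join '|', map { quotemeta($_) } @keywords;
    return qr/\b($alternation)\b/;
}

my $regex = build_keyword_regex('foo', 'a.c');

for my $text ('food is good', 'a foo here', 'see a.c now', 'see abc now') {
    if ($text =~ $regex) {
        print "'$text' matches: $1\n";
    }
    else {
        print "'$text' does not match\n";
    }
}
```

Only the second and third strings match: "food" is rejected by \b, and "abc" is rejected because the dot in "a.c" was escaped.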
I updated the solution to match each keyword on a given line and to only match complete words.
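Matching each keyword on a given line relies on the /g modifier in scalar context: every pass through the while condition resumes the search where the previous match ended. A small sketch of just that loop, with a hypothetical helper name and a hard-coded keyword list:

```perl
use strict;
use warnings;

# Hypothetical helper: returns every whole-word keyword match in a line, in order.
sub matches_in_line {
    my ($line, $regex) = @_;
    my @found;
    # In scalar context, /g continues from the end of the previous match,
    # so the loop body runs once per match.
    while ($line =~ /$regex/g) {
        push @found, $1;
    }
    return @found;
}

my $regex = qr/\b(hello|foo|bar)\b/;
print join(', ', matches_in_line('bar bar bar', $regex)), "\n";     # bar, bar, bar
print join(', ', matches_in_line('hello foo food', $regex)), "\n";  # hello, foo
```

Without /g, the condition would find the first match again on every pass and the loop would never end.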