How do I load a list of words I don't want stemmed from an editable file for Lingua::Stem's add_exceptions($exceptions_hash_ref) in Perl?
Question
I am using Perl's Lingua::Stem module and want to keep the words that should not be stemmed in a text file (or another editable format), so that I can add words to it at any time.
The module's documentation shows:
add_exceptions($exceptions_hash_ref);
What is the best way to do this?
I have used this method to hard-code some exceptions, but I want to load them from a file instead:
# adding default exceptions
Lingua::Stem::add_exceptions({
    'emily'  => 'emily',
    'driven' => 'driven',
});
Solution
Assuming your editable file holds one whitespace-separated word/stem pair per line, like so:
emily emily
driven driven
Your code could be:
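open my $fh, "<", "excep.txt" or die "Cannot open excep.txt: $!";
# Each line splits into a (word, stem) pair; the flattened pairs fill the hash.
my $href = { map split, <$fh> };
close $fh;
Lingua::Stem::add_exceptions($href);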
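The one-liner assumes every line holds exactly two fields; a line with a single word would shift the key/value pairing of everything after it. A slightly more forgiving reader could look like the sketch below. The helper name read_exceptions, the # comment syntax, and the rule that a lone word maps to itself are assumptions made for illustration, not part of Lingua::Stem. The demo reads from an in-memory string instead of excep.txt so that it runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lingua::Stem): read "word [stem]" lines
# from a filehandle and return a hash ref suitable for add_exceptions().
sub read_exceptions {
    my ($fh) = @_;
    my %exceptions;
    while (my $line = <$fh>) {
        $line =~ s/#.*//;                       # drop comments (assumed syntax)
        my ($word, $stem) = split ' ', $line;   # split on any whitespace
        next unless defined $word;              # skip blank or comment-only lines
        # A lone word maps to itself, so it is left unstemmed.
        $exceptions{$word} = defined $stem ? $stem : $word;
    }
    return \%exceptions;
}

# Demonstration on an in-memory file with made-up sample content.
my $sample = "emily emily\n\n# names\ndriven\nsteely steel\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $href = read_exceptions($fh);
close $fh;

print "$_ => $href->{$_}\n" for sort keys %$href;
```

With a real file, open excep.txt as before and pass the result on: Lingua::Stem::add_exceptions(read_exceptions($fh));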
OTHER TIPS
You can define a function to load exceptions from the given file:
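sub load_exceptions {
    my $fname = shift;
    my %list;
    # Include $! so a failure reports why the file could not be opened.
    open(my $in, "<", $fname) or die "load_exceptions: cannot open $fname: $!";
    while (<$in>) {
        chomp;
        next unless length;    # skip blank lines
        $list{$_} = $_;        # a word maps to itself, so it is left unstemmed
    }
    close $in;
    return \%list;
}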
And use it:
Lingua::Stem::add_exceptions(load_exceptions("notstem.txt"));
Example input file:
emily
driven
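To check that the loaded words really are left alone, the exceptions can also be attached to a single stemmer object, a form the Lingua::Stem documentation lists next to the default one. This is a minimal sketch, assuming Lingua::Stem is installed. The temporary file stands in for notstem.txt, and the sample words are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Lingua::Stem;

# Stand-in for notstem.txt so the sketch is self-contained.
my ($out, $fname) = tempfile(UNLINK => 1);
print {$out} "emily\ndriven\n";
close $out or die "Cannot close $fname: $!";

# Same one-word-per-line format as the example input file.
open my $in, '<', $fname or die "Cannot open $fname: $!";
chomp(my @words = <$in>);
close $in;
my %exceptions = map { $_ => $_ } grep { length } @words;

# Attach the exceptions to one stemmer object rather than to the defaults.
my $stemmer = Lingua::Stem->new(-locale => 'EN');
$stemmer->add_exceptions(\%exceptions);

# stem() returns an array reference of stemmed words.
my $stems = $stemmer->stem(qw(emily driven running));
print join(' ', @$stems), "\n";
```

Words listed in the file should come back unchanged, while the others are stemmed as usual.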