How do I read an editable file of words that I don't want stemmed and pass it to Lingua::Stem's add_exceptions($exceptions_hash_ref) in Perl?

StackOverflow https://stackoverflow.com/questions/12062182

  •  27-06-2021

Question

I am using Perl's Lingua::Stem module, and I want to keep a list of words that should not be stemmed in a text file (or some other editable format), so that I can add words to it at any time.

The module's documentation shows:

add_exceptions($exceptions_hash_ref);

What is the best way to do this?

I have used this function to hard-code some exceptions, but I want to load them from a file instead.

# adding default exceptions
Lingua::Stem::add_exceptions({ 'emily' => 'emily',
                            'driven' => 'driven',
                        });

Solution

Assuming your editable file holds one whitespace-separated pair per line (the word, then the form it should map to), like so:

emily emily
driven driven

Your code could be:

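# Split every line on whitespace and collect the word/stem pairs into a hash.
open my $fh, "<", "excep.txt" or die "Cannot open excep.txt: $!";
my $href = { map split, <$fh> };
close $fh;
Lingua::Stem::add_exceptions($href);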
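
If the file is edited by hand, it may help to tolerate blank lines, comments, and lines that contain only a single word. The following is a minimal sketch of that idea, not part of Lingua::Stem: the helper name read_exceptions and the "#" comment convention are assumptions made for illustration, and the sample data is read from an in-memory filehandle so the snippet runs without any file on disk.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by Lingua::Stem): build an exceptions
# hash from an already opened filehandle.
sub read_exceptions {
    my ($fh) = @_;
    my %exceptions;
    while (my $line = <$fh>) {
        $line =~ s/#.*//;                      # assumed convention: "#" starts a comment
        my ($word, $stem) = split ' ', $line;  # split on any run of whitespace
        next unless defined $word;             # skip blank or comment-only lines
        # A line holding a single word maps that word to itself.
        $exceptions{$word} = defined $stem ? $stem : $word;
    }
    return \%exceptions;
}

# Demonstration with an in-memory file; a real script would open "excep.txt".
my $sample = "emily emily\n\n# names\ndriven   # left as-is\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $exceptions = read_exceptions($fh);
close $fh;

print "$_ => $exceptions->{$_}\n" for sort keys %$exceptions;

# With the module loaded, the hash reference can then be passed on:
# Lingua::Stem::add_exceptions($exceptions);
```

For the sample data this prints one line for "driven" and one for "emily", each mapped to itself. Because the helper takes a filehandle rather than a file name, the same code can be used with a real file or with test data.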

OTHER TIPS

You can define a function to load exceptions from the given file:

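sub load_exceptions {
  my $fname = shift;
  my %list;
  open(my $in, "<", $fname) or die("load_exceptions: cannot open $fname: $!");
  while (my $line = <$in>) {
    $line =~ s/^\s+|\s+$//g;   # trim surrounding whitespace, including the newline
    next if $line eq '';       # skip blank lines
    $list{$line} = $line;      # map each word to itself so it is left unstemmed
  }
  close $in;
  return \%list;
}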

And use it:

Lingua::Stem::add_exceptions(load_exceptions("notstem.txt"));

Example input file:

emily
driven

Licensed under: CC-BY-SA with attribution