Matching plural words in Treetop
Question
Is there a way to programmatically match plural words using Treetop? The Linguistics gem will pluralize a word, but how can that be inserted back into the parser?
Here's an example of what I'm trying to do:
#!/usr/bin/env ruby
require 'treetop'
require 'linguistics'
include Linguistics::EN

Treetop.load_from_string DATA.read
parser = RecipeParser.new
p parser.parse('cans')

__END__
grammar Recipe
  rule units
    unit &{|s| plural(s[0].text_value) }
  end

  rule unit
    'can'
  end
end
Solution
In general, the Linguistics gem can't pluralize arbitrary Treetop rule definitions, because they're grammar expressions, not strings.
Using a semantic predicate, your recipe.treetop file could define all your valid singular unit strings in an array, pluralize them, and then accept the matched token only if it is one of those pluralized units:
require "linguistics"

grammar Recipe
  rule units
    [a-zA-Z\-]+ &{ |u|
      Linguistics.use(:en)
      singular_units = [ "can" ]
      singular_units.
        map(&:en).
        map(&:plural).
        include?(u[0].text_value)
    }
  end
end
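Another way to get the pluralized words "back into the parser" is to generate the grammar text itself before loading it. The following is only a sketch of that idea, not part of the original answer. The helpers naive_plural and units_grammar are hypothetical names, and naive_plural is a deliberately simple stand-in for the Linguistics gem so that the snippet runs with the standard library alone. It also assumes unit names contain no single quotes.

```ruby
# Hypothetical stand-in for the Linguistics gem's pluralizer.
# It covers only a few regular English patterns.
def naive_plural(word)
  case word
  when /(s|x|z|ch|sh)\z/ then "#{word}es"
  when /[^aeiou]y\z/     then "#{word[0..-2]}ies"
  else                        "#{word}s"
  end
end

# Build Treetop grammar source whose `units` rule lists every plural
# form as a literal alternative.
def units_grammar(singular_units, pluralize: method(:naive_plural))
  plurals = singular_units.map { |unit| pluralize.call(unit) }.uniq
  # PEG choice is ordered, so longer literals go first; otherwise a
  # literal that is a prefix of a longer one would match too early.
  alternatives = plurals.sort_by { |plural| [-plural.length, plural] }
                        .map { |plural| "'#{plural}'" }
                        .join(" / ")
  <<~GRAMMAR
    grammar Recipe
      rule units
        #{alternatives}
      end
    end
  GRAMMAR
end

grammar_source = units_grammar(%w[can cup pinch])
puts grammar_source
```

With the real gems installed, the generated string could presumably be handed to Treetop.load_from_string, as the question's script does with DATA.read, and the pluralize argument could be swapped for a lambda that calls Linguistics. The trade-off compared with the semantic predicate is that the plural list is computed once at load time rather than on every match.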