Question

I need to check spelling and grammar in texts, so I started using the LanguageTool API. When I run the start-up code they provide, as follows:

JLanguageTool langTool = new JLanguageTool(Language.ENGLISH);
langTool.activateDefaultPatternRules();
List<RuleMatch> matches = langTool.check("Eat I rice " +
    "every day and go school to good as a boy");
for (RuleMatch match : matches) {
  System.out.println("Potential error at line " +
      match.getEndLine() + ", column " +
      match.getColumn() + ": " + match.getMessage());
  System.out.println("Suggested correction: " +
      match.getSuggestedReplacements());
}

I don't get any errors. Sorry if I am wrong, but is the sentence "Eat I rice every day and go school to good as a boy" grammatically correct? Either way, is there any way to detect such sentences (meaningless and/or grammatically incorrect) with the tool?

Solution

LanguageTool is rule-based. Evidently the sentence "Eat I rice every day and go school to good as a boy" is not caught by any of the existing rules yet.

http://wiki.languagetool.org/tips-and-tricks explains how to add user-defined rules to LanguageTool.

Here is an example of such a rule:

<rule>
  <pattern>
    <token>
      <exception regexp="yes">(that|ha[ds]|will|must|could|can|should|would|does|did|may|might|t|let)</exception>
      <exception inflected="yes" regexp="yes">feel|hear|see|watch|prevent|help|stop|be</exception>
      <exception postag="C[CD]|IN|DT|MD|NNP|\." postag_regexp="yes"></exception>
      <exception scope="previous" postag="PRP$"/>
    </token>
    <token postag="NNP" regexp="yes">.{2,}<exception postag="JJ|CC|RP|DT|PRP\$?|NNPS|NNS|IN|RB|WRB|VBN" postag_regexp="yes"></exception></token>
    <marker>
      <token postag="VB|VBP" postag_regexp="yes" regexp="yes">\p{Lower}+<exception postag="VBN|VBD|JJ|IN|MD" postag_regexp="yes"></exception></token>
    </marker>
    <token postag="IN|DT" postag_regexp="yes"></token>
  </pattern>
  <message>The proper name in singular (<match no="2"></match>) must be used with a third-person verb: <suggestion><match no="3" postag="VBZ"></match></suggestion>.</message>
  <short>Grammatical problem</short>
  <example correction="walks" type="incorrect">Ann <marker>walk</marker> to the building.</example>
  <example type="correct">Bill <marker>walks</marker> to the building.</example>
  <example type="correct">Guinness <marker>walked</marker> to the building.</example>
  <example type="correct">Roosevelt and Hoover speak each other's lines.</example>
  <example type="correct">Boys are at higher risk for autism than girls.</example>
  <example type="correct">In reply, he said he was too old for this.</example>
  <example type="correct">I can see Bill looking through the window.</example>
  <example type="correct">Richard J. Hughes made his Morris County debut in his bid for the Democratic gubernatorial elections.</example>
  <example type="correct">... last night got its seven-concert Beethoven cycle at Carnegie Hall off to a good start.</example>
  <example type="correct">... and through knowing Him better to become happier and more effective people.</example>
  <!-- TODO: Fix false-positive: The library and Medical Center are to the north.-->
  <!-- The present Federal program of vocational education began in 1917. -->
</rule>

There is an online rule editor available at

http://community.languagetool.org/ruleEditor2/

A simple solution to the problem would be:

<!-- English rule, 2014-09-19 -->
<rule id="ID" name="EatI">
  <pattern>
    <token>Eat</token>
    <token>i</token>
  </pattern>
  <message>Instead of <match no="1"/> <match no="2"/> it should be <match no="2"/> <match no="1"/></message>
  <url>http://stackoverflow.com/questions/13016469/detecting-meaningless-and-or-grammatically-incorrect-sentence-with-languagetool/25933907#25933907</url>
  <short>wrong order of verb and noun</short>
  <example type='incorrect'><marker>Eat i</marker> rice</example>
  <example type='correct'>I eat rice</example>
</rule>

Of course, this only covers the verb "Eat", but I hope it shows how the approach works.
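To make the limitation concrete, here is a rough plain-Java sketch of the same idea extended to a handful of verbs. It does not use the LanguageTool API at all; the class name VerbBeforeIRule, the method suggest, and the verb list are hypothetical and exist only for illustration. In a real LanguageTool rule you would more likely widen the first token with a regexp or a postag attribute (as the longer rule above does with postag="VB|VBP") than keep a word list.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

class VerbBeforeIRule {
    // Hypothetical, hand-picked verb list (an assumption for this sketch);
    // a real rule would rely on part-of-speech tags instead.
    private static final Set<String> VERBS =
            new HashSet<>(Arrays.asList("eat", "drink", "go", "read", "play"));

    /**
     * Returns a suggested reordering if the sentence starts with "<verb> I",
     * or Optional.empty() if this single rule does not apply.
     */
    static Optional<String> suggest(String sentence) {
        String[] tokens = sentence.trim().split("\\s+");
        if (tokens.length < 2) {
            return Optional.empty();
        }
        String first = tokens[0].toLowerCase(Locale.ROOT);
        // Case-insensitive comparison so that "Eat i" and "Eat I" both match.
        if (VERBS.contains(first) && tokens[1].equalsIgnoreCase("I")) {
            return Optional.of("I " + first);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(suggest("Eat I rice every day"));  // Optional[I eat]
        System.out.println(suggest("Drink i water"));         // Optional[I drink]
        System.out.println(suggest("I eat rice every day"));  // Optional.empty
    }
}
```

As with the XML rule, anything outside the pattern passes silently: the "go school to good as a boy" part of the original sentence would still not be flagged unless another rule is written for it.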

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow