문제

I am trying to do a syntax text corrector for my compilers' class. The idea is: I have some rules, which are inherent to the language (in my case, Portuguese), like "A valid phrase is SUBJECT VERB ADJECTIVE", as in "Ruby is great".

Ok, so first I have to tokenize the input "Ruby is great". So I have a text file "verbs", with a lot of verbs, one by line. Then I have one text "adjectives", one "pronouns", etc.

I am trying to use Ragel to create a parser, but I don't know how I could do something like:

%%{
  machine test;
  subject = <open-the-subjects-file-and-accept-each-one-of-them>;
  verb = <open-the-verbs-file-and-accept-each-one-of-them>;
  adjective = <open-the-adjective-file-and-accept-each-one-of-them>;
  main = subject verb adjective @ { print "Valid phrase!" } ;
}%%

I looked at ANTLR, Lex/Yacc, Ragel, etc. But couldn't find one that seemed to solve this problem. The only way to do this that I could think of was to preprocess Ragel's input file, so that my program reads the file and writes its contents at the right place. But I don't like this solution either.

Does anyone knows how I could do this? There's no problem if it isn't with Ragel, I just want to solve this problem. I would like to use Ruby or Python, but that's not really necessary either.

Thanks.

도움이 되었습니까?

해결책

If you want to read the files at compile time .. make them be of the format:

subject = \
ruby|\
python|\
c++

then use ragel's 'include' or 'import' statement (I forget which .. must check the manual) to import it.


If you want to check the list of subjects at run time, maybe just make ragel read 3 words, then have an action associated with each word. The action can read the file and lookup if the word is good or not at runtime.

The action reads the text file and compares the word's contents.

%%{
machine test

action startWord {
    lastWordStart = p;
}
action checkSubject {
   word = input[lastWordStart:p+1]  
   for possible in open('subjects.txt'):
       if possible == word:
           fgoto verb
   # If we get here do whatever ragel does to go to an error or just raise a python exception 
   raise Exception("Invalid subject '%s'" % word)
}
action checkVerb { .. exercise for reader .. ;) }
action checkAdjective { .. put adjective checking code here .. }

subject = ws*.(alnum*)>startWord%checkSubject
verb := : ws*.(alnum*)>startWord%checkVerb
adjective := ws*.)alnum*)>startWord%checkAdjective
main := subject;
}%%

다른 팁

With bison I would write the lexer by hand, which lookup the words in the predefined dictionary.

라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 StackOverflow
scroll top