Creator of ParseKit here.
Context:
ParseKit is currently undergoing a bit of a redesign. There are two ways to use ParseKit: the old way and the new way.
OLD WAY (dynamic): Previously, ParseKit produced dynamic, non-deterministic parsers at runtime (this code is still available in the library). Producing these parsers was slow, and the parsers produced were very slow as well (although they had some interesting properties which are useful in very rare circumstances).
NEW WAY (static): Now, using the ParserGenApp (as you've described here), ParseKit produces static ObjC source code for deterministic (PEG) memoizing parsers. The source code is generated at design time, and you then compile it into your project. The parsers produced are fast. This new option is now the preferred method of using ParseKit. The old method will be deprecated somehow.
I will assume you are using the new static method (from your question, it sounds like you already are).
Answer:
Here are two ways to match "words" in your ParseKit grammars:
- Use the built-in Word rule reference, which is equivalent to [_a-zA-Z][-'_a-zA-Z0-9]* and will usually do what you want:
    username = Word;
- If the built-in Word terminal does not match exactly what you want, prefix it with a Syntactic Predicate (idea/syntax stolen from ANTLR3) containing a Regex in the MATCHES() macro:
    username = { MATCHES(@"[c-xB-Y]", LS(1)) }? Word;
  OR:
    username = { MATCHES_IGNORE_CASE(@"[a-z]", LS(1)) }? Word;
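
To get a feel for which strings the pattern quoted above for Word would accept, here is a minimal standalone sketch using plain NSRegularExpression. The helper name ExampleLooksLikeWord is made up for illustration and is not part of ParseKit, and the real tokenizer may treat edge cases differently from this regex.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (NOT a ParseKit API): returns YES if the whole string
// matches the pattern given above as equivalent to the built-in Word rule.
static BOOL ExampleLooksLikeWord(NSString *s) {
    // The hyphen is escaped inside the character class to keep the
    // pattern unambiguous; otherwise it is the pattern quoted above.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^[_a-zA-Z][\\-'_a-zA-Z0-9]*$"
                                                  options:0
                                                    error:NULL];
    NSRange whole = NSMakeRange(0, [s length]);
    return [regex numberOfMatchesInString:s options:0 range:whole] > 0;
}

int main(void) {
    @autoreleasepool {
        // Try a few candidate usernames against the Word-like pattern.
        for (NSString *candidate in @[@"john_doe", @"o'neil", @"9lives", @"-dash"]) {
            NSLog(@"%@ -> %@", candidate, ExampleLooksLikeWord(candidate) ? @"YES" : @"NO");
        }
    }
    return 0;
}
```

Under this pattern, a candidate that starts with a digit or a hyphen is rejected, while apostrophes and hyphens are accepted after the first character.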
Notes:
- The Syntactic Predicate syntax is { ... }? placed just before a rule reference (Word in this case).
- You may use any ObjC code inside the Syntactic Predicate, but it must return a boolean value. MATCHES(), MATCHES_IGNORE_CASE(), and LS() are just C macros I have made available for convenience.
- If the ObjC code inside the Predicate is longer than a single expression, you must use semicolons to terminate statements as normal. Remember to return a boolean value.
- The MATCHES() and MATCHES_IGNORE_CASE() macros are a shortcut for using NSRegularExpression.
- The LS() macro stands for Lookahead String. LS(1) means fetch the NSString value of the first lookahead token. In this case the first lookahead token will be the token matched by Word. To look ahead two or three tokens, you would use LS(2), LS(3) and so forth.
- The handy old Regex literal syntax is not (yet) available in grammars used with the new static ParseKit (aka ParserGenApp) as it was in the old dynamic ParseKit. I would like to add that eventually.
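
To make the NSRegularExpression note concrete, here is a rough sketch of the kind of check a MATCHES()-style predicate performs on the lookahead string. The function name ExampleMatches is hypothetical and not part of ParseKit, and the sketch assumes the macros report a match anywhere in the string; the actual macro definitions may differ, so check the ParseKit headers before relying on this.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for MATCHES() / MATCHES_IGNORE_CASE().
// Assumption: a match anywhere in `input` counts as success.
static BOOL ExampleMatches(NSString *pattern, NSString *input, BOOL ignoreCase) {
    NSRegularExpressionOptions opts =
        ignoreCase ? NSRegularExpressionCaseInsensitive : 0;
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:opts
                                                    error:&error];
    if (!regex) {
        return NO; // invalid pattern: treated as "no match" in this sketch
    }
    NSRange whole = NSMakeRange(0, [input length]);
    return [regex numberOfMatchesInString:input options:0 range:whole] > 0;
}

int main(void) {
    @autoreleasepool {
        // Unanchored: any character of the token may satisfy the class.
        NSLog(@"%d", ExampleMatches(@"[c-xB-Y]", @"alpha", NO));
        // Anchored with ^: only the first character is tested.
        NSLog(@"%d", ExampleMatches(@"^[c-xB-Y]", @"alpha", NO));
        // Case-insensitive variant, as with MATCHES_IGNORE_CASE().
        NSLog(@"%d", ExampleMatches(@"^[a-z]", @"Zed", YES));
    }
    return 0;
}
```

If that match-anywhere assumption holds for the real macros, a pattern meant to constrain only the first character of the token should be anchored, for example username = { MATCHES(@"^[c-xB-Y]", LS(1)) }? Word;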