Question

I am very intrigued by the ability to add actions to ParseKit grammars. There is surprisingly little documentation on what is available in those actions. Say I have two rules like:

databaseName        = Word;
createTableStmt     ='CREATE' ('TEMP'| 'TEMPORARY')? 'TABLE' 'IF NOT EXISTS'? databaseName;

This obviously isn't a whole grammar, but it will serve as an example. When parsing, I'd like to "return" a CreateTableStmt object that has certain properties. If I understand the tool correctly, I'd add an action to the rule, do stuff, then push the result onto the assembly, which will carry it around for the next rule to deal with or use.

So for example it would look like:

createTableStmt     ='CREATE' ('TEMP'| 'TEMPORARY')? 'TABLE' 'IF NOT EXISTS'? databaseName;
{
    AnotherObj* dbName = Pop(); //gives me the top most object
    CreateTableStmt* createTable = [[CreateTableStmt alloc] initWith:dbName];
    //set if it was temporary
    // set 'IF NOT EXISTS'
    PUSH(createTable);//push back on stack for next rule to use
}

Then, when everything is parsed, I can just get that root object off the stack, and it is a fully instantiated custom representation of the grammar. Somewhat like building an AST, if I remember correctly. I can then work with that representation much more easily than with the passed-in string.

My question is how can I see if it matched ('TEMP' | 'TEMPORARY') so I can set the value? Are those tokens on the stack? Is there a better way than to pop back to the 'CREATE' and see if we passed it? Should I be popping back to the bottom of the stack anyway on each match?

Also if my rule was instead

qualifiedTableName  = (databaseName '.')? tableName (('INDEXED' 'BY' indexName) | ('NOT' 'INDEXED'))?;

Is it correct to assume that the action would not be called until the rule had been matched? So in this case, when the action is called, the stack could look like:

possibly:
|'INDEXED'
|'NOT'
or:
|indexName (A custom object possibly)
|'BY'
|'INDEXED'

|tableName (for sure will be here)

and possibly these
|'.'            (if this is here I know the database name must be here) if not push last one on?
|databaseName
--------------(perhaps more things from other rules)

Are these correct assessments? Is there any other documentation on actions? I know it is heavily based on ANTLR, but it's the subtle differences that can really get you in trouble.

Solution

Creator of ParseKit here. A few items:

ParseKit deprecation:

Just this week, I have forked ParseKit to a cleaner/smaller/faster library called PEGKit. ParseKit should be considered deprecated, and PEGKit should be used for all new development. Please move to PEGKit.

PEGKit's grammar and code-gen features are nearly identical to ParseKit's, and your ParseKit grammars are usable with PEGKit with a few small changes. In fact, all of the examples in your question here are usable with no changes in PEGKit.

See the Deprecation Notice in the ParseKit README.

And this tutorial on PEGKit.

Syntax errors in your grammar:

I spot 3 syntax errors in your grammar samples above (this applies equally to both ParseKit and PEGKit).

  1. This line:

    createTableStmt     ='CREATE' ('TEMP'| 'TEMPORARY')? 'TABLE' 'IF NOT EXISTS'? databaseName;

    Should be:

    createTableStmt     ='CREATE' ('TEMP'| 'TEMPORARY')? 'TABLE' ('IF' 'NOT' 'EXISTS')? databaseName;

    Notice the breakup of the invalid 'IF NOT EXISTS' construct into individual literal tokens. This is not only necessary, but also desirable, so that variable whitespace between the words is allowed.

  2. The POP() macro should be all upper case.

  3. Your createTableStmt rule is missing a semicolon at the very end (after the action's closing }).
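
To illustrate the first point, here is a rough sketch in Python rather than ParseKit/PEGKit. It assumes a plain whitespace split as a stand-in for the real tokenizer, and the helper names `word_tokens` and `matches_literals` are hypothetical. Under that assumption, no single token can ever equal 'IF NOT EXISTS', while three separate literals match however the words are spaced.

```python
# Illustration only: a whitespace split stands in for the real tokenizer
# (an assumption); none of these names are ParseKit or PEGKit API.

def word_tokens(text):
    """Split input into tokens, ignoring how much whitespace separates them."""
    return text.split()

def matches_literals(tokens, start, literals):
    """True if the tokens beginning at index `start` equal `literals` in order."""
    return tokens[start:start + len(literals)] == literals

tokens = word_tokens("CREATE TABLE IF   NOT\n  EXISTS foo")

# Is any single token equal to the three-word string?
print("IF NOT EXISTS" in tokens)
# Do three separate literals match, despite the irregular spacing?
print(matches_literals(tokens, 2, ["IF", "NOT", "EXISTS"]))
```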

Before Answering:

Make sure you are using PEGKit v0.3.1 or later (HEAD of master). I fixed an important bug while finding the answer to your question, and my solutions below require this fix.

Answer to your first question:

My question is how can I see if it matched ('TEMP' | 'TEMPORARY') so I can set the value?

Good question! You basically have the right idea in your further comments above.

Specifically, I would probably break up the createTableStmt rule into 4 rules like this:

createTableStmt = 'CREATE'! tempOpt 'TABLE'! existsOpt databaseName ';'!;

databaseName = QuotedString;

tempOpt 
    = ('TEMP'! | 'TEMPORARY'!)
    | Empty
    ;

existsOpt 
    = ('IF'! 'NOT'! 'EXISTS'!)
    | Empty
    ;

  • Notice all of the vital ! discard directives for discarding unneeded literal tokens.

  • Also notice that I've changed the last two rules to use | Empty rather than ?. This is so I can add Actions to the Empty alternatives (you'll see that in a sec).

Then you can either add Actions to your grammar, or use ObjC parser delegate callbacks if you prefer to work in pure code.

If you use Actions in your grammar, something like the following will work:

createTableStmt = 'CREATE'! tempOpt 'TABLE'! existsOpt databaseName ';'!
{
    NSString *dbName = POP();
    BOOL ifNotExists = POP_BOOL();
    BOOL isTemp = POP_BOOL();
    NSLog(@"create table: %@, %d, %d", dbName, ifNotExists, isTemp);
    // go to town
    // myCreateTable(dbName, ifNotExists, isTemp);
};

databaseName = QuotedString
{
    // pop the string value of the `PKToken` on the top of the stack
    NSString *dbName = POP_STR();
    // trim quotes
    dbName = [dbName substringWithRange:NSMakeRange(1, [dbName length]-2)];
    // leave it on the stack for later
    PUSH(dbName);
};

tempOpt 
    = ('TEMP'! | 'TEMPORARY'!) { PUSH(@YES); }
    | Empty { PUSH(@NO); }
    ;

existsOpt 
    = ('IF'! 'NOT'! 'EXISTS'!) { PUSH(@YES); }
    | Empty { PUSH(@NO); }
    ;

I've added this Grammar and a test case to the PEGKit project.
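
The stack discipline in the Actions above can be pictured with a small simulation. This is a sketch in Python, not PEGKit code: a plain list stands in for the assembly's stack, the input is assumed to be pre-tokenized, and `parse_create_table` is a hypothetical name. It only shows that each optional rule always pushes exactly one boolean, so the final action can pop three values in the reverse of the order they were pushed.

```python
# Simulation only: a Python list stands in for the assembly's stack.
# None of these names are PEGKit API.

def parse_create_table(tokens):
    """Mimic the four rules above on a pre-tokenized statement."""
    stack = []
    pos = 0

    def discard(word):
        # Match a literal and drop it, like the `!` discard directive.
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != word:
            raise SyntaxError(f"expected {word!r} at token {pos}")
        pos += 1

    discard("CREATE")

    # tempOpt: push True for TEMP/TEMPORARY, False for the Empty alternative.
    if pos < len(tokens) and tokens[pos] in ("TEMP", "TEMPORARY"):
        pos += 1
        stack.append(True)
    else:
        stack.append(False)

    discard("TABLE")

    # existsOpt: push True for IF NOT EXISTS, False for the Empty alternative.
    if pos < len(tokens) and tokens[pos] == "IF":
        discard("IF")
        discard("NOT")
        discard("EXISTS")
        stack.append(True)
    else:
        stack.append(False)

    # databaseName: take the quoted string, trim the quotes, push it back.
    if pos >= len(tokens) or len(tokens[pos]) < 2:
        raise SyntaxError(f"expected quoted name at token {pos}")
    stack.append(tokens[pos][1:-1])
    pos += 1

    discard(";")

    # createTableStmt action: pop in reverse order of pushing.
    db_name = stack.pop()
    if_not_exists = stack.pop()
    is_temp = stack.pop()
    return db_name, if_not_exists, is_temp
```

For example, `parse_create_table(["CREATE", "TEMP", "TABLE", "IF", "NOT", "EXISTS", "'foo'", ";"])` yields the name with its quotes trimmed plus two `True` flags, while the same statement without the optional parts yields two `False` flags. The action never has to inspect how deep the stack is.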

As for your second question, please break it out as a new SO question and tag it ParseKit and PEGKit, and I will get to it ASAP.

Licensed under: CC-BY-SA with attribution