For my application, I am trying to build an SQLite parser. As my application is using Objective-C, ParseKit seems like a good option. I read SQLite's syntax diagrams and built a grammar based on them. However, when I try to parse something using this grammar, the parser goes into infinite recursion.
The only statements I need are SELECT, INSERT, UPDATE, and DELETE (I need SELECT mostly because the others refer to it). My @start is designed to handle multiple statements separated by semicolons:
@start = statement (';' statement)*;
statement = select_stmt | insert_stmt | update_stmt | delete_stmt ;
The statements are as follows:
select_stmt = select_core ( compound_operator select_core )* order_expr? limit_expr? ;
select_core = 'select' ( 'distinct' | 'all' )? result_column ( ',' result_column )* from_expr where_expr group_expr ;
result_column = '*' | table_name '.' '*' | expr ( 'as'? column_alias )? ;
insert_stmt = 'insert' or_on_failure? 'into' table_name insert_expr ;
insert_expr = insert_expr_cols? (insert_expr_values | select_stmt) ;
insert_expr_cols = '(' column_name ( ',' column_name )* ')' ;
insert_expr_values = 'values' '(' expr ( ',' expr )* ')' ;
insert_expr_defaults = 'default' 'values' ;
update_stmt = 'update' or_on_failure? qualified_table_name update_expr where_expr? limited_expr? ;
update_expr = 'set' update_expr_col ( ',' update_expr_col )* ;
update_expr_col = column_name '=' expr ;
delete_stmt = 'delete' 'from' qualified_table_name where_expr? limited_expr? ;
And their supporting expressions:
order_expr = 'order' 'by' ordering_term (',' ordering_term)* ;
limit_expr = 'limit' expr ( ( 'offset' | ',' ) expr )? ;
from_expr = 'from' join_source ;
where_expr = 'where' expr ;
group_expr = 'group' 'by' expr ( ',' expr )? ( 'having' expr )? ;
join_source = single_source ( join_operator single_source join_constraint? )* ;
single_source = table_name 'as' table_alias indexed_by? | '(' select_stmt ')' ( 'as'? table_alias )? | '(' join_source ')' ;
or_on_failure = 'or' on_failure ;
on_failure = 'rollback' | 'abort' | 'replace' | 'fail' | 'ignore' ;
limited_expr = order_expr limit_expr ;
Names, aliases:
database_name = name ;
table_name = (database_name '.')? name ;
column_name = (table_name '.')? name ;
table_alias = name ;
column_alias = name ;
index_name = name ;
type_name = name+ ( '(' number (',' number)? ')' )? ;
function_name = name ;
collation_name = name ;
qualified_table_name = table_name indexed_by? ;
Miscellaneous operators, etc:
indexed_by = 'indexed' 'by' index_name | 'not' 'indexed' ;
unary_operator = symbol ;
binary_operator = symbol ;
compound_operator = 'union' 'all'? | 'intersect' | 'except' ;
join_operator = ',' | 'natural'? ( 'left' 'outer'? | 'inner' | 'cross' ) 'join' ;
join_constraint = 'on' expr | 'using' '(' column_name ( ',' column_name )* ')' ;
Basic types:
literal = number
| string
| 'null'
| 'current_time'
| 'current_data'
| 'current_timestamp' ;
number = Number ;
string = Word
| QuotedString ;
name = Word
| QuotedString ;
symbol = Symbol;
And the EXPR:
expr = literal
| column_name
| unary_operator expr
| expr binary_operator expr
| function_name '(' ( '*' | 'distinct'? expr ( ',' expr )* )? ')'
| '(' expr ')'
| 'cast' '(' expr 'as' type_name ')'
| expr 'collate' collation_name
| expr 'not'? ( 'like' | 'glob' | 'regexp' | 'match' ) expr ( 'escape' expr )?
| expr ( 'isnull' | 'notnull' | 'not' 'null' )
| expr 'is' 'not'? expr
| expr 'not'? 'between' expr 'and' expr
| expr 'not'? 'in' ( table_name | '(' ( select_stmt | expr ( ',' expr )* )? ')' )
| ( 'not'? 'exists' )? '(' select_stmt ')'
| 'case' expr? ( 'when' expr 'then' expr )+ ( 'else' expr )? 'end' ;
When I stepped through the code, the basic path was @start -> statement -> select_stmt -> select_core -> result_column -> expr -> expr -> expr...
After about 8-9k calls between PKParser's matchAndAssemble: and PKParser/Subclass's allMatchesFor:, something dies, usually from a EXC_BAD_ACCESS error (and then LLDB can't do anything either).
P.S.: If you're going to post an answer saying, 'Oh, you should really do/use this,' A) I like Objective-C. Don't tell me not to use it. It's my choice. My response will likely be a rant. And B) I tried digging through SQLite's source to use their parser. I never got anywhere. If you think I should use that, kindly post the source for their parser as a single file with no other dependencies.