Question

I am new here at SO. I want to write a T-SQL (Sybase) parser that only listens for a few relevant SQL statements.

Is it possible to ignore the irrelevant statements without writing the complete T-SQL syntax into the grammar file, so that errors like "line 8:2 mismatched input 'INSERT' expecting {EXEC, BEGIN, END, IF}" no longer occur?

My input is the following SQL stored procedure (just an example):

CREATE PROCEDURE mySQL (@BaseLoglevel INT,
                    @ReleaseId INT,
                    @TargetSystem VARCHAR (5),
                    @IgnoreTimeStamp INT)                       
AS
BEGIN   
    INSERT INTO Departments
        (DepartmentID, DepartmentName, DepartmentHeadID)
        VALUES (600, 'Eastern Sales', 501)
    EXEC DoThis
    BEGIN
        EXEC DoSQLProc
    END
    if (@x=0) 
    begin
        exec DoSQL
    end
    else begin
        exec ReadTables
    end
    exec DoThat
    exec DoOther
END

My grammar file contains nothing that describes the INSERT statement, so I want to ignore such unknown input. Is that possible?

This is my grammar file:

grammar Tsql;

/************Parser Rules*******************/
file : createProcedure sqlBlock;

createProcedure: CREATE PROC ID paramList? AS;

//Params of create procedure
paramList: LPAREN (sqlParam)(COMMA sqlParam)* RPAREN;
sqlParam: AT_SIGN ID sqlType; //(EQ defaultValue)?;
sqlType: (VARCHARTYPE | NUMERICTYPE | INTTYPE | CHARTYPE) length?;
length: LPAREN INT RPAREN;

sqlBlock : BEGIN sql* END;

sql:  sqlBlock
| sqlIf
| sqlExec               
;

sqlExec: EXEC ID (LPAREN sqlExprList? RPAREN)*  ; //SQLCall

//IF-rule
sqlIf: IF LPAREN sqlexpr RPAREN sqlIfBlock (sqlElseBlock)?;
sqlElseBlock: ELSE BEGIN sql* END;
sqlIfBlock: BEGIN sql* END;

/* T-SQL expressions */
sqlexpr
: ID LPAREN sqlExprList? RPAREN         # K
| AT_SIGN ID                            # SQLVar
| LPAREN sqlexpr RPAREN                 # SQLParens
| sqlexpr EQ INT                        # SQLEqual
| sqlexpr NOT_EQ INT                    # SQLNotEqual
| sqlexpr LTH sqlexpr                   # SQLLessThan
| sqlexpr GTH sqlexpr                   # SQLGreaterThan
| sqlexpr LEQ sqlexpr                   # SQLLessEqual
| sqlexpr GEQ sqlexpr                   # SQLGreaterEqual
| sqlexpr (PLUS|MINUS) sqlexpr          # SQLAddSub
| sqlexpr (MUTLIPLY|DIVIDE) sqlexpr     # SQLMultDiv
| LPAREN sqlexpr RPAREN                 # SQLParens
| NOT sqlexpr                           # SQLNot
;

sqlExprList : sqlexpr (',' sqlexpr)* ;      // arg list

/************Lexer Rules*******************/
//createProcedure
CREATE : 'CREATE' | 'create';
PROC : 'PROCEDURE' | 'procedure';
AS : 'AS'|'as';
EXEC            : ('EXEC'|'exec');
//SqlTypes
INTTYPE: 'int'|'INT';
VARCHARTYPE : 'varchar'|'VARCHAR';
NUMERICTYPE : 'numeric'|'NUMERIC';
CHARTYPE : 'char'| 'CHAR';
//SqlBlock
BEGIN: 'BEGIN' | 'begin';
END: 'END' | 'end';
//If
IF: 'IF' | 'if';
ELSE : 'ELSE' | 'else';

RETURN          : ('RETURN' | 'return');
DECLARE         : ('DECLARE'|'declare');
AT_SIGN         : '@';


ID  :   LETTER (LETTER | [0-9])* ;

APOSTROPH       : [\'];
QUOTE           : ["];
LPAREN          : '(';
RPAREN          : ')';
COMMA           : ',';
SEMICOLON       : ';';
DOT             : '.';
EQ              : '=';
NOT_EQ          : ('!='|'<>');
LTH             : ('<');
GTH             : ('>');
LEQ             : ('<=');
GEQ             : ('>=');
RBRACK          : ']';
LBRACK          : '[';
PLUS            : '+';
MINUS           : '-';
MUTLIPLY        : '*';
DIVIDE          : '/';
COLON           : ':'; 

NOT             : ('NOT' | '!');

INT :   [0-9]+ ;

ML_COMMENT
:   '/*'.*? '*/' -> skip
;

SL_COMMENT
:   '//' .*? '\n' -> skip
;

WS : [ \t\r\n]+ -> skip;

fragment
LETTER : [a-zA-Z] ;

Many thanks in advance.


Solution

Is it possible to ignore the irrelevant statements without writing the complete T-SQL syntax into the grammar file?

You could do something like this:

file
 : unit* EOF
 ;

unit
 : my_interesting_statement
 | . // any token
 ;    

my_interesting_statement
 : createProcedure sqlBlock
 // | other statements here? (keep this commented out until filled in:
 //   an empty alternative would let unit* match nothing)
 ;

// parser rules

// lexer rules

// Last lexer rule catches any character
ANY
 : .
 ;

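The rule file now matches zero or more units followed by EOF. A unit first tries to match my_interesting_statement; when that is not possible, the last alternative of the unit rule, the ., matches exactly one token (note: inside a parser rule, a . matches a single token, not a single character). The ANY lexer rule, placed last, turns every character that no other lexer rule covers into a token of its own, so the lexer does not report errors for unknown characters either.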
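One thing to watch for with the input from the question: the INSERT sits inside the BEGIN ... END body, so it is reached through sqlBlock rather than through unit, and my_interesting_statement would likely fail to match the procedure as a whole. A minimal sketch of the same catch-all idea applied inside the block could look like the following. The rule name ignored is made up for illustration and is not part of the original grammar, and the sketch assumes that the skipped statements never contain BEGIN, END, IF, ELSE or EXEC tokens themselves (for example inside a string literal or a CASE ... END expression):

```antlr
sqlBlock
 : BEGIN sql* END
 ;

sql
 : sqlBlock
 | sqlIf
 | sqlExec
 | ignored   // hypothetical catch-all, not part of the original grammar
 ;

// Matches one token that neither opens nor closes a modelled construct,
// so the block structure and the EXEC calls stay recognizable.
ignored
 : ~(BEGIN | END | IF | ELSE | EXEC)
 ;
```

Excluding those five tokens from the negated set, instead of using a plain ., keeps the catch-all from swallowing the END that closes a block or the ELSE that belongs to an IF. A listener that only implements enterSqlExec would then see the EXEC calls, while the tokens of the INSERT statement end up in ignored nodes.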

Licensed under: CC-BY-SA with attribution