Question

I know that programming languages can be defined in EBNF, which can be converted into regular expressions. Right now I am working on a very simple BASIC interpreter for a project. The code has to be entered in a GUI, which should validate the syntax before the code is transferred to an embedded system, where it is executed.

I was googling for an article or tutorial on writing a validator for this job, but I could not find one. Is it just a matter of defining the regular expressions and trying to match them?

Note: the GUI part is written in Java while the embedded code is written in C++.

Solution

Your initial premise, that an EBNF language description can be converted to regular expressions, is incorrect. The set of languages that can be recognized with regular expressions (the regular languages) is a strict subset of the set of languages that can be described in EBNF (the context-free languages).
For example, it is impossible to write a regular expression that checks whether parentheses nested to an arbitrary depth are balanced.
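
To make the parentheses example concrete, here is a minimal sketch in Java (the GUI language mentioned in the question). The class and method names are hypothetical. The point is that the check needs a counter that can grow without bound, which is exactly the kind of memory a regular expression (a finite automaton) does not have.

```java
final class ParenChecker {
    /** Returns true if every '(' has a matching ')' and no ')' comes too early. */
    static boolean isBalanced(String input) {
        int depth = 0; // number of currently open parentheses
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    return false; // a ')' with no matching '(' before it
                }
            }
        }
        return depth == 0; // anything still open means unbalanced
    }
}
```

For instance, `ParenChecker.isBalanced("(1 + (2 * 3))")` returns `true`, while `ParenChecker.isBalanced("(1 + 2))")` returns `false`.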

The best way to validate your language input is to write a parser for it. For a small language you can write one by hand; there are also parser generators (à la yacc/bison) for Java, such as ANTLR or JavaCC.
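
As a rough illustration of what a hand-written parser could look like, here is a sketch of a recursive-descent syntax validator in Java. The grammar is an assumption made up for this example (only `LET` and `PRINT` statements with arithmetic expressions), not the asker's actual BASIC dialect, and the class name `TinyBasicValidator` is hypothetical. Regular expressions are still used, but only for the lexer; the nesting is handled by methods that call each other recursively.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Assumed toy grammar (not a real BASIC dialect):
//   statement  := "LET" IDENT "=" expression | "PRINT" expression
//   expression := term (("+" | "-") term)*
//   term       := factor (("*" | "/") factor)*
//   factor     := NUMBER | IDENT | "(" expression ")"
final class TinyBasicValidator {
    // Each token is a regular language, so a regex is fine for the lexer.
    private static final Pattern TOKEN =
        Pattern.compile("\\s*(\\d+|[A-Za-z][A-Za-z0-9]*|[-+*/=()])");

    private final List<String> tokens;
    private int pos = 0;

    private TinyBasicValidator(List<String> tokens) { this.tokens = tokens; }

    /** Returns true if the line is one syntactically valid statement. */
    static boolean isValid(String line) {
        try {
            TinyBasicValidator p = new TinyBasicValidator(tokenize(line));
            p.statement();
            return p.pos == p.tokens.size(); // no trailing tokens allowed
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    private static List<String> tokenize(String line) {
        String s = line.trim();
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        int i = 0;
        while (i < s.length()) {
            m.region(i, s.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("unexpected character at column " + i);
            }
            out.add(m.group(1));
            i = m.end();
        }
        return out;
    }

    private void statement() {
        String keyword = next();
        if (keyword.equalsIgnoreCase("LET")) {
            if (!Character.isLetter(next().charAt(0))) {
                throw new IllegalArgumentException("expected identifier");
            }
            expect("=");
            expression();
        } else if (keyword.equalsIgnoreCase("PRINT")) {
            expression();
        } else {
            throw new IllegalArgumentException("unknown statement: " + keyword);
        }
    }

    private void expression() {
        term();
        while (peekIs("+") || peekIs("-")) { pos++; term(); }
    }

    private void term() {
        factor();
        while (peekIs("*") || peekIs("/")) { pos++; factor(); }
    }

    private void factor() {
        String t = next();
        if (t.equals("(")) {
            expression(); // recursion handles arbitrarily deep nesting
            expect(")");
        } else if (!Character.isLetterOrDigit(t.charAt(0))) {
            throw new IllegalArgumentException("unexpected token: " + t);
        }
        // Note: keywords are not reserved in this sketch, so "PRINT LET" passes.
    }

    private String next() {
        if (pos >= tokens.size()) throw new IllegalArgumentException("unexpected end of input");
        return tokens.get(pos++);
    }

    private boolean peekIs(String s) { return pos < tokens.size() && tokens.get(pos).equals(s); }

    private void expect(String s) {
        if (!next().equals(s)) throw new IllegalArgumentException("expected " + s);
    }
}
```

With this sketch, `TinyBasicValidator.isValid("LET x = (1 + 2) * y")` returns `true` and `TinyBasicValidator.isValid("PRINT (1 + 2")` returns `false`. A real validator would probably report the error message and position to the GUI instead of a plain boolean, and a parser generator would produce similar code from the grammar.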

OTHER TIPS

Code validation should not only be syntactic; more importantly, it should take some of the semantics into account, which is much harder. Read about static program analysis, type inference, and related topics.

For your project, did you consider embedding an existing interpreter (e.g. Guile or Lua) inside your program?

If you want to write an interpreter, read about domain-specific languages. See also this answer to a related question.

Licensed under: CC-BY-SA with attribution