Question

I have a C++ app that processes a binary file. The file is a collection of events, say A/B/C, and on detecting event A in the file, the app handles it in "handler A".

Now I need to write another script in a custom language, which gets executed orthogonally to the binary file processing. The script can have something like:

define proc onA
{
    c = QueryVariable(cat)
    print(c)
}

So when the app handles event "A" from the binary file, it has to parse this script file, look for onA, and convert the statements in the onA proc to routines supported by the app. For example, QueryVariable should copy the value of the variable "cat" defined in the app to the script variable "c". The app should also check the syntax and semantics of the script. Where can I get the best info for deciding on the design? My knowledge of parse trees and grammars has really weakened.

Thanks


Solution

An easy way to build an interpreter:

  • Define a parser for the language from its syntax
  • Build an abstract syntax tree (AST)
  • Apply a visitor function to traverse the AST in preorder and "execute" the actions suggested by the AST nodes.
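As a rough illustration of those three steps, here is a minimal sketch (C++14 or later) for a toy expression language with integers, '+', '*' and parentheses. The grammar and the names Node, Parser and evaluate are made up for this example; they are not the asker's script language, just the smallest thing that shows parser, AST and visitor together.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical AST node: a number leaf ('n') or a binary operator ('+' or '*').
struct Node {
    char op = 'n';
    int value = 0;                    // meaningful only when op == 'n'
    std::unique_ptr<Node> lhs, rhs;   // children, set only for operators
};

// Steps 1 and 2: a recursive-descent parser that builds the AST.
// Grammar: sum     := product ('+' product)*
//          product := primary ('*' primary)*
//          primary := NUMBER | '(' sum ')'
class Parser {
public:
    explicit Parser(std::string text) : src_(std::move(text)) {}

    std::unique_ptr<Node> parse() {
        auto tree = parseSum();
        if (peek() != '\0') throw std::runtime_error("syntax error: trailing input");
        return tree;
    }

private:
    // Skip blanks, then return the next character ('\0' at end of input).
    char peek() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
        return pos_ < src_.size() ? src_[pos_] : '\0';
    }

    static std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
        auto n = std::make_unique<Node>();
        n->op = op;
        n->lhs = std::move(l);
        n->rhs = std::move(r);
        return n;
    }

    std::unique_ptr<Node> parseSum() {
        auto left = parseProduct();
        while (peek() == '+') {
            ++pos_;
            auto right = parseProduct();
            left = binary('+', std::move(left), std::move(right));
        }
        return left;
    }

    std::unique_ptr<Node> parseProduct() {
        auto left = parsePrimary();
        while (peek() == '*') {
            ++pos_;
            auto right = parsePrimary();
            left = binary('*', std::move(left), std::move(right));
        }
        return left;
    }

    std::unique_ptr<Node> parsePrimary() {
        const char c = peek();
        if (c == '(') {
            ++pos_;
            auto inner = parseSum();
            if (peek() != ')') throw std::runtime_error("syntax error: expected ')'");
            ++pos_;
            return inner;
        }
        if (!std::isdigit(static_cast<unsigned char>(c)))
            throw std::runtime_error("syntax error: expected a number");
        auto leaf = std::make_unique<Node>();
        while (pos_ < src_.size() && std::isdigit(static_cast<unsigned char>(src_[pos_])))
            leaf->value = leaf->value * 10 + (src_[pos_++] - '0');
        return leaf;
    }

    std::string src_;
    std::size_t pos_ = 0;
};

// Step 3: the visitor. It looks at a node, descends into its children,
// and "executes" the action the node stands for.
int evaluate(const Node& n) {
    if (n.op == 'n') return n.value;
    const int l = evaluate(*n.lhs);
    const int r = evaluate(*n.rhs);
    return n.op == '+' ? l + r : l * r;
}
```

With this sketch, evaluate(*Parser("2 + 3 * (4 + 1)").parse()) yields 17, and malformed input such as "2 +" is rejected with a std::runtime_error, which is where syntax checking would hook in.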

Some AST nodes will be "definitional", i.e., they declare the existence of some named entity, such as your "define proc onA" phrase above. Typically the action is to associate the named entity with its content, e.g., form a triplet <onA, proc, <body>> and store it in a symbol table indexed by the first entry. This makes finding such definitions easier.
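A symbol table along those lines could be sketched as below. Definition, SymbolTable and findHandler are hypothetical names, the body is kept as plain statement text as a stand-in for a real AST subtree, and the "on" + event-letter naming rule is an assumption taken from the question's onA example.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// The <kind, body> part of the triplet; the name is the key in the table.
struct Definition {
    std::string kind;                 // e.g. "proc"
    std::vector<std::string> body;    // stand-in for the AST of the body
};

class SymbolTable {
public:
    // Returns false if the name is already defined (a simple semantic check).
    bool define(const std::string& name, Definition def) {
        return entries_.emplace(name, std::move(def)).second;
    }

    // Returns nullptr when the name is unknown.
    const Definition* lookup(const std::string& name) const {
        auto it = entries_.find(name);
        return it == entries_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, Definition> entries_;
};

// Assumed convention: event 'A' is handled by a proc named "onA".
const Definition* findHandler(const SymbolTable& table, char event) {
    const Definition* def = table.lookup(std::string("on") + event);
    return (def != nullptr && def->kind == "proc") ? def : nullptr;
}
```

The app would call define once per "define proc" node while walking the script's AST, and findHandler(table, 'A') each time an A event shows up in the binary file. A null result simply means the script has no handler for that event.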

Later, when your event processing encounters an A event, your application knows to look up "onA" in this symbol table. When found, the AST is traversed by the visitor function to execute its content. You'll usually need a value stack to record intermediate expression values, with AST leaves representing operands (variables, constants) pushing values onto that stack, and operators (+, -, <=) popping values off and computing new results to push. Assignment operations take the top stack value and store it in the symbol table entry associated with the identifier name. Control operators (if, do) take values off the top of the stack and use them to decide what part of the program (e.g., what subtree) to execute next.
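The value-stack scheme just described might look roughly like this. It is only a sketch: the node kinds, the helper functions (constant, var, binary, assign, ifElse) and the Interpreter struct are invented for illustration, and only '+', '<=', assignment and if/else are covered.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

enum class Kind { Const, Var, Add, LessEq, Assign, If };

struct Node;
using NodePtr = std::shared_ptr<Node>;

struct Node {
    Kind kind;
    int value;                  // used by Const
    std::string name;           // used by Var and Assign
    std::vector<NodePtr> kids;  // operands, or condition/then/else for If
};

// Small helpers for building trees by hand (a parser would normally do this).
NodePtr node(Kind k, int value, std::string name, std::vector<NodePtr> kids) {
    return std::make_shared<Node>(Node{k, value, std::move(name), std::move(kids)});
}
NodePtr constant(int v) { return node(Kind::Const, v, "", {}); }
NodePtr var(std::string n) { return node(Kind::Var, 0, std::move(n), {}); }
NodePtr binary(Kind k, NodePtr a, NodePtr b) { return node(k, 0, "", {a, b}); }
NodePtr assign(std::string n, NodePtr rhs) { return node(Kind::Assign, 0, std::move(n), {rhs}); }
NodePtr ifElse(NodePtr c, NodePtr t, NodePtr e) { return node(Kind::If, 0, "", {c, t, e}); }

struct Interpreter {
    std::vector<int> stack;                // intermediate expression values
    std::map<std::string, int> variables;  // script variables by name

    int pop() {
        if (stack.empty()) throw std::runtime_error("value stack underflow");
        const int v = stack.back();
        stack.pop_back();
        return v;
    }

    void run(const Node& n) {
        switch (n.kind) {
        case Kind::Const:                  // leaf: push the constant
            stack.push_back(n.value);
            break;
        case Kind::Var: {                  // leaf: push the variable's value
            auto it = variables.find(n.name);
            if (it == variables.end()) throw std::runtime_error("undefined variable: " + n.name);
            stack.push_back(it->second);
            break;
        }
        case Kind::Add:
        case Kind::LessEq: {               // operator: pop two operands, push the result
            run(*n.kids[0]);
            run(*n.kids[1]);
            const int rhs = pop();
            const int lhs = pop();
            stack.push_back(n.kind == Kind::Add ? lhs + rhs : (lhs <= rhs ? 1 : 0));
            break;
        }
        case Kind::Assign:                 // store the top of stack under the name
            run(*n.kids[0]);
            variables[n.name] = pop();
            break;
        case Kind::If:                     // pop the condition, pick a subtree
            run(*n.kids[0]);
            run(pop() != 0 ? *n.kids[1] : *n.kids[2]);
            break;
        }
    }
};
```

For example, with variables["cat"] preset to 7 by the host app, running assign("c", binary(Kind::Add, var("cat"), constant(3))) leaves c equal to 10 and the value stack empty again. A builtin like QueryVariable would be one more node kind that reads from the app's own variables instead of the script's.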

All of this is well known and can be found in most books on compilers and interpreters. Peter Brown's book on this is particularly nice even though it seems relatively old:

Writing Interactive Compilers and Interpreters.

OTHER TIPS

There must be some interpreter or compiler for the scripting language. Check if it supports embedding in C or C++. Most script languages do.

Next choice, or perhaps first, would be to just run the script externally, using the existing compiler/interpreter.

I can't think of any reason why one of the first two options wouldn't do, but if neither does, consider building an interpreter using ANTLR or, for a small language, Boost Spirit. Disclaimer: I haven't used the first, and I've only tried out Boost Spirit on a small toy example.

Cheers & hth.,

PS: If you can choose the script language, consider JavaScript and just use Google's V8 engine, whose embedding API is reportedly excellent.

Licensed under: CC-BY-SA with attribution