Question

At what point is syntactic sugar usually recognized as such: during parsing or in a later step? And at what point is it better to do so?

Assume that the expression 'array[index]' is syntactic sugar for the expression 'get_element(array,index)'.

  1. If it is recognized during parsing, then the parse tree of 'array[index]' is identical to that of 'get_element(array,index)'.

  2. If it is recognized during a later step, then the parse tree of 'array[index]' is distinct from that of 'get_element(array,index)'.


Solution

tl;dr : @rici already gave the one-liner answer -- I'd like to expand on that a bit from my own experiences building parsers.

I used to build parsers that would go straight from input to AST in a single step. I no longer do that, because it ended up being a maintainability nightmare. So many different tasks -- string recognition, error reporting, context-sensitive constraints, tree construction -- were smashed together that the parser became a complete mess. It was very difficult to test and debug, and making changes was not fun. Plus, the correspondence between parser and grammar (which, in my opinion, is a very valuable feature of grammar-based parsing approaches) was lost.

My current approach is to build parsers that correspond exactly (hopefully) to the input grammar, and have a default CST (Concrete Syntax Tree) built at the same time. Yes, this means that the CST includes "junk" -- like opening and closing braces -- but that means that the parser doesn't have to futz with tree-building at all.
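To make that concrete, here is a minimal sketch of a parser that mirrors a tiny grammar and keeps every token in its output. The grammar, the tuple-based CST shape, and the names `tokenize` and `parse_expr` are all assumptions made for illustration; they are not taken from any particular parser library.

```python
import re

# One token is either an identifier or a single bracket; leading whitespace is skipped.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[\[\]])")

def tokenize(text):
    """Split the input into identifier and bracket tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

# Hypothetical grammar, mirrored one-to-one by parse_expr:
#   expr   := name suffix*
#   suffix := '[' expr ']'
def parse_expr(tokens, pos=0):
    """Return (cst, next_pos). A CST node is a tuple: (rule, *children)."""
    if pos >= len(tokens) or tokens[pos] in ("[", "]"):
        raise SyntaxError("expected a name")
    node = ("name", tokens[pos])
    pos += 1
    while pos < len(tokens) and tokens[pos] == "[":
        inner, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != "]":
            raise SyntaxError("expected ']'")
        # The brackets stay in the tree: no decisions about tree shape are made here.
        node = ("index", node, "[", inner, "]")
        pos += 1
    return node, pos
```

With this sketch, `parse_expr(tokenize("array[index]"))` yields an `"index"` node whose children still include the `"["` and `"]"` tokens, which is the "junk" that a later pass is free to throw away.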

Then, in a later stage, the CST is converted to an AST. If there are any context-sensitive constraints (such as unique keys in an object literal), I try to check them all here (and keep them out of the parser). This step is where "junk" (i.e. concrete syntax such as braces) gets discarded. This step is also where I would desugar any concrete-syntactic sugar -- in fact, my definition of syntactic sugar is stuff that appears in the concrete but not in the abstract syntax.
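Applied to the question's example, the conversion step might look roughly like the sketch below. It assumes a CST made of plain tuples of the form `(rule, *children)` with punctuation kept as bare strings; the `Name` and `Call` classes and the `to_ast` function are hypothetical names chosen for illustration. Note that the AST has no node type for indexing at all, so the sugar cannot survive the conversion.

```python
from dataclasses import dataclass

# Hypothetical AST: only names and calls exist; there is deliberately no Index node.
@dataclass
class Name:
    ident: str

@dataclass
class Call:
    func: str
    args: list

def to_ast(cst):
    """Convert a tuple-based CST node into an AST node.

    Punctuation tokens are dropped, and 'array[index]' is desugared
    into a call to get_element.
    """
    rule, *children = cst
    if rule == "name":
        return Name(children[0])
    if rule == "index":
        # Assumed children: array, "[", index, "]"
        array, _open, index, _close = children
        return Call("get_element", [to_ast(array), to_ast(index)])
    if rule == "call":
        # Assumed children: function name, "(", arg, ",", arg, ..., ")"
        func, *rest = children
        # Sub-expressions are tuples; punctuation tokens are bare strings.
        args = [to_ast(child) for child in rest if isinstance(child, tuple)]
        return Call(func, args)
    raise ValueError(f"unknown CST rule: {rule!r}")
```

Under these assumptions, the CSTs for 'array[index]' and 'get_element(array,index)' are different tuples, but `to_ast` maps both to the same `Call("get_element", [Name("array"), Name("index")])`. That corresponds to option 2 in the question: the parse trees are distinct, and the difference disappears only in the abstract syntax.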

The advantages that I've noticed from separating my mess into individual steps are that the parsing code is cleaner and shorter, and more declarative, and that the thing is more modular (meaning that I can change how the CST is mapped to an AST without having to even think about the parsing code).

Summary: I remove syntactic sugar* when converting from concrete syntax to abstract syntax, i.e. the sugar does not appear at all in the AST.


*: your definition of syntactic sugar may differ.

Licensed under: CC-BY-SA with attribution