Question

I'd like to define an unordered set of values using an Extended Backus-Naur Form (EBNF) context free grammar. It's easy to define an unordered list of values in EBNF, for example:

value = 'A' | 'B' | 'C';
list = value, {',', value};

However, I'm having doubts it can be done for an unordered set.

Below are examples of valid unordered sets of values:

A, B, C, D
A, B, D, C
A, D, C, B
...
D, C, B, A

Whilst invalid lists would be:

A, A, C, D
B, C, C, B
A, A, A, A
...

or lists of arbitrary length.

A, A, B, C, D, A
A, B, C, D, A, B, C, D
...
Was it helpful?

Solution

You can define unordered sets in EBNF, but only by listing all possible enumerations. That's impractical for sets larger than about two elements.

The standard -- to the extent that EBNF is a standardized notation -- allows you to use English (or any other language you feel comfortable with) to describe a sequence which is not otherwise describable. For example, the EBNF for EBNF includes this production:

syntactic exception
= ? a syntactic-factor that could be replaced 
    by a syntactic-factor containing no
    meta-identifiers
  ? ;

Similarly, you could write something like:

value = 'A' | 'B' | 'C';
list = value, {',', value};
set = ? a "list" where all "value"s are unique ? ;

or perhaps

set = ? a "list" with exactly three values
        where all "value"s are unique
      ? ;

That's not much use for building a parser generator, but it might be helpful for a human reader who understands the same language as you do.

For real parser generators, a common strategy is to allow any list, which might include repeated elements, and then reject invalid lists during semantic analysis. (Or even earlier. It's not a difficult analysis.)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top