Question

Take a look at the definition below. What exactly is this supposed to define? According to the EBNF specification, brackets [] define an optional item, so why is the * required? Isn't that superfluous (since it means a repetition of zero or more times)?

The second thing is, how do you interpret the part within parentheses? The - is the exclusion indicator, so does it mean excluding any of the items within parentheses, or the sequence of all three (zero or more from ^<&, followed by ]]>, followed by zero or more from ^<&)?

CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)

Or am I completely mistaken, and this is something other than EBNF?

Thanks in advance

Solution

The XML specification does not strictly use EBNF as specified by ISO. Section 6 of the XML specification ("Notation") defines the notation it uses. There, square brackets denote a character class, as in regular expressions, rather than an optional element of the grammar, which is why the * is needed to express repetition. The - used for exclusion excludes the group within the parentheses as a whole (the full sequence), not each item inside it individually. Thus, the line you quoted builds up as follows:

  • [^<&] - any character that is not a left angle bracket (<) or an ampersand (&)
  • [^<&]* - zero or more characters that are not left angle brackets or ampersands
  • [^<&]* - ([^<&]* ']]>' [^<&]*) - zero or more characters that are not left angle brackets or ampersands, excluding any such string that contains the character sequence ]]> anywhere within it
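
To make the breakdown above concrete, here is a minimal sketch in Python that checks a string against the CharData production. The function name `is_char_data` is made up for illustration and is not part of any XML library. The sketch only mirrors the two sides of the - operator and is not a substitute for a real XML parser.

```python
import re

# [^<&]* : zero or more characters other than '<' and '&'
BASE = re.compile(r"[^<&]*")


def is_char_data(text: str) -> bool:
    """Check text against CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*).

    Hypothetical helper for illustration only.
    """
    # Left side of '-': the whole string uses only allowed characters.
    matches_base = BASE.fullmatch(text) is not None
    # Right side of '-': the excluded group as a whole, i.e. any string of
    # allowed characters that contains the literal sequence ']]>'.
    in_excluded = matches_base and "]]>" in text
    return matches_base and not in_excluded


if __name__ == "__main__":
    for sample in ["hello", "", "a < b", "a & b", "a ]]> b", "a ]] > b"]:
        print(repr(sample), is_char_data(sample))
```

Note that `]` and `>` are each allowed on their own; only the exact three-character sequence `]]>` causes a string to be excluded.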
Licensed under: CC-BY-SA with attribution