Question

I have a beginner's question about the W3C specification (EBNF notation) of XPath expressions. The specification can be found at http://www.w3.org/TR/xpath/. In particular, I am trying to understand the following expression:

(//attribute::name | //attribute::id)[starts-with(string(self::node()), "be") or starts-with(string(self::node()), "1")]

This appears to be a valid expression. I verified it using http://www.freeformatter.com/xpath-tester.html with the following XML document:

<documentRoot>
<!-- Test data -->
<?xc value="2" ?>
<parent name="data" >
   <child id="1"  name="alpha" >Some Text</child>
   <child id="2"  name="beta" >
      <grandchild id="2.1"  name="beta-alpha" ></grandchild>
      <grandchild id="2.2"  name="beta-beta" ></grandchild>
   </child>
   <pet name="tigger"  type="cat" >
      <data>
         <birthday month="sept"  day="19" ></birthday>
         <food name="Acme Cat Food" ></food>
      </data>
   </pet>
   <pet name="Fido"  type="dog" >
      <description>
         Large dog!
      </description>
      <data>
         <birthday month="feb"  day="3" ></birthday>
         <food name="Acme Dog Food" ></food>
      </data>
   </pet>
   <rogue name="is this real?" >
      <data>
         Hates dogs!
      </data>
   </rogue>
   <child id="3"  name="gamma"  mark="yes" >
      <!-- A comment -->
      <description>
         Likes all animals - especially dogs!
      </description>
      <grandchild id="3.1"  name="gamma-alpha" >
         <![CDATA[ Some non-parsable character data ]]>
      </grandchild>
      <grandchild id="3.2"  name="gamma-beta" ></grandchild>
   </child>
</parent>
</documentRoot>

This gives me the following results:

Attribute='id="1"'
Attribute='name="beta"'
Attribute='name="beta-alpha"'
Attribute='name="beta-beta"'

It is not clear to me which sequence of EBNF productions would produce the above query.

Thanks for your help.


Solution 2

I'm not sure of the proper notation for writing out a derivation, but at the top level Expr eventually expands to FilterExpr Predicate:

Expr > OrExpr > AndExpr > EqualityExpr > RelationalExpr > AdditiveExpr > MultiplicativeExpr > UnaryExpr > UnionExpr > PathExpr > FilterExpr > FilterExpr Predicate

gives you the 2 parts:

  • the filter (//attribute::name | //attribute::id)
  • and the predicate [starts-with(string(self::node()), "be") or starts-with(string(self::node()), "1")]

(//attribute::name | //attribute::id)

FilterExpr > PrimaryExpr > '(' Expr ')'
Expr > OrExpr > AndExpr > EqualityExpr > RelationalExpr > AdditiveExpr > MultiplicativeExpr > UnaryExpr > UnionExpr > UnionExpr '|' PathExpr

gives you //attribute::name and //attribute::id

//attribute::name and //attribute::id

PathExpr > LocationPath > AbsoluteLocationPath > AbbreviatedAbsoluteLocationPath > '//' RelativeLocationPath
RelativeLocationPath > Step > AxisSpecifier NodeTest Predicate*
    - AxisSpecifier > AxisName '::'
        - AxisName > 'attribute'
    - NodeTest > NameTest

NameTest being name and id

Predicate [starts-with(string(self::node()), "be") or starts-with(string(self::node()), "1")]

Predicate > '[' PredicateExpr ']'
PredicateExpr > Expr > OrExpr > OrExpr 'or' AndExpr
    - OrExpr > AndExpr
    - AndExpr > EqualityExpr > RelationalExpr > AdditiveExpr > MultiplicativeExpr > UnaryExpr > UnionExpr > PathExpr > FilterExpr > PrimaryExpr > FunctionCall > FunctionName '(' ( Argument ( ',' Argument )* )? ')'
        Argument > Expr

FunctionName is starts-with. The first argument is another FunctionCall (the string function), and the second argument is a Literal (reached via PathExpr > FilterExpr > PrimaryExpr > Literal), namely "be" or "1".

Finally, self::node() comes from:

RelativeLocationPath > Step > AxisSpecifier NodeTest Predicate*
    - AxisSpecifier > AxisName '::'
        - AxisName > 'self'
    - NodeTest > NodeType '(' ')'

NodeType being 'node'
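
The derivation above can be checked mechanically. The following is a minimal sketch, not part of the specification: it writes the derivation as a nested tuple tree (a made-up representation, with hypothetical helper names such as wrap, step and attribute_path) and confirms that reading the terminal tokens from left to right gives back the original query, ignoring whitespace.

```python
def wrap(names, child):
    """Nest child under a chain of single-child productions, outermost first."""
    for name in reversed(names):
        child = (name, [child])
    return child

# Chains of unit productions that recur in the derivation.
AND_TO_UNARY = ["AndExpr", "EqualityExpr", "RelationalExpr",
                "AdditiveExpr", "MultiplicativeExpr", "UnaryExpr"]
AND_TO_PATH = AND_TO_UNARY + ["UnionExpr", "PathExpr"]
EXPR_TO_PATH = ["Expr", "OrExpr"] + AND_TO_PATH

def step(axis, node_test_children):
    # Step > AxisSpecifier NodeTest, with AxisSpecifier > AxisName '::'
    axis_specifier = ("AxisSpecifier", [("AxisName", [axis]), "::"])
    return ("Step", [axis_specifier, ("NodeTest", node_test_children)])

def attribute_path(name):
    # PathExpr > LocationPath > AbsoluteLocationPath
    #   > AbbreviatedAbsoluteLocationPath > '//' RelativeLocationPath
    relative = ("RelativeLocationPath",
                [step("attribute", [("NameTest", [name])])])
    abbreviated = ("AbbreviatedAbsoluteLocationPath", ["//", relative])
    return wrap(["PathExpr", "LocationPath", "AbsoluteLocationPath"], abbreviated)

def function_call(name, first, second=None):
    children = [("FunctionName", [name]), "(", ("Argument", [first])]
    if second is not None:
        children += [",", ("Argument", [second])]
    return ("FunctionCall", children + [")"])

def starts_with_call(prefix):
    node_test = [("NodeType", ["node"]), "(", ")"]
    self_node = wrap(EXPR_TO_PATH + ["LocationPath", "RelativeLocationPath"],
                     step("self", node_test))
    string_call = wrap(EXPR_TO_PATH + ["FilterExpr", "PrimaryExpr"],
                       function_call("string", self_node))
    literal = wrap(EXPR_TO_PATH + ["FilterExpr", "PrimaryExpr", "Literal"],
                   '"%s"' % prefix)
    # Returned as an AndExpr so it can sit on either side of 'or'.
    return wrap(AND_TO_PATH + ["FilterExpr", "PrimaryExpr"],
                function_call("starts-with", string_call, literal))

# UnionExpr > UnionExpr '|' PathExpr, wrapped in PrimaryExpr > '(' Expr ')'
union = ("UnionExpr", [("UnionExpr", [attribute_path("name")]), "|",
                       attribute_path("id")])
group = ("PrimaryExpr",
         ["(", wrap(["Expr", "OrExpr"] + AND_TO_UNARY, union), ")"])

# Predicate > '[' PredicateExpr ']', PredicateExpr > Expr > OrExpr 'or' AndExpr
or_expr = ("OrExpr", [("OrExpr", [starts_with_call("be")]), "or",
                      starts_with_call("1")])
predicate = ("Predicate", ["[", ("PredicateExpr", [("Expr", [or_expr])]), "]"])

# Expr > ... > PathExpr > FilterExpr > FilterExpr Predicate
TREE = wrap(EXPR_TO_PATH, ("FilterExpr", [("FilterExpr", [group]), predicate]))

def leaves(node):
    """Collect the terminal tokens of the tree from left to right."""
    if isinstance(node, str):
        return [node]
    _name, children = node
    return [token for child in children for token in leaves(child)]

QUERY = ('(//attribute::name | //attribute::id)'
         '[starts-with(string(self::node()), "be") or '
         'starts-with(string(self::node()), "1")]')
derived = "".join(leaves(TREE))
print(derived == QUERY.replace(" ", ""))
```

Every interior node names one production from the XPath 1.0 grammar, so walking the tree from the root downwards reproduces the chains listed above.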

OTHER TIPS

Break-down:

(                        # group
  //attribute::name      #   the long form of //@name
  |                      #   union
  //attribute::id        #   the long form of //@id 
)                        # group end
[                        # predicate (think "where")
  starts-with(           #   returns true or false
    string(              #     returns a string
      self::node()       #        the long form of "."
    ),                   #     )
    "be"                 #     a string literal
  )                      #   )
  or                     #   logical operator
  starts-with(           #   ...idem
    string(              #
      self::node()       #
    ),                   #
    "1"                  #
  )                      #
]                        # end predicate

So the expression is a needlessly verbose version of

(//@name | //@id)[starts-with(., "be") or starts-with(., "1")]

which selects all attributes named "name" or "id" whose values begin with "be" or "1".
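
To make the meaning concrete, here is a rough sketch that emulates the expression by hand in Python. It is not an XPath engine: the standard library's xml.etree.ElementTree does not support the attribute axis or unions, so the loop below just spells out the same selection. The function name select_attributes and the trimmed-down document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A shortened version of the document from the question.
XML = """<documentRoot>
<parent name="data">
   <child id="1" name="alpha">Some Text</child>
   <child id="2" name="beta">
      <grandchild id="2.1" name="beta-alpha"></grandchild>
      <grandchild id="2.2" name="beta-beta"></grandchild>
   </child>
   <pet name="tigger" type="cat">
      <data><birthday month="sept" day="19"></birthday></data>
   </pet>
   <child id="3" name="gamma" mark="yes"></child>
</parent>
</documentRoot>"""

def select_attributes(xml_text):
    """Emulate (//@name | //@id)[starts-with(., "be") or starts-with(., "1")]."""
    root = ET.fromstring(xml_text)
    selected = []
    for element in root.iter():  # every element, in document order
        for attr_name, value in element.attrib.items():
            # (//@name | //@id): only attributes called "name" or "id"
            if attr_name not in ("name", "id"):
                continue
            # the predicate: the attribute's string value starts with "be" or "1"
            if value.startswith(("be", "1")):
                selected.append((attr_name, value))
    return selected

matches = select_attributes(XML)
for attr_name, value in matches:
    print('%s="%s"' % (attr_name, value))
```

Note that day="19" is not selected even though its value starts with "1", because the union only ever contains attributes named "name" or "id".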

I'm not sure why you want the EBNF productions for this (homework, I presume), but understanding the expression itself might help you with it.

A few extra notes:

  • attribute:: designates the attribute axis.
  • An axis can precede any node test (when no axis is given, the default is child::).
  • The self:: axis is special: it contains only the context node itself. The short form of self::node() is the dot (.). The implication is that if the context node is a <foo> element, self::foo will match it, while self::bar will not.
  • // is the shorthand for /descendant-or-self::node()/
  • The string() function is redundant here because starts-with() converts its arguments to strings implicitly anyway.
  • The union operator joins two node sets. Nodes that appear in both sets are not duplicated in the result.
  • Predicates are applied to each node in a node set, effectively filtering it.
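
The last two notes can be illustrated with a small hand-written emulation. This is only a sketch under stated assumptions: it mimics (//@name | //child/@name)[starts-with(., "be")] with plain Python collections on a tiny made-up document, and it identifies an attribute node by the position of its owner element plus the attribute name, which is a convention chosen for this example only.

```python
import xml.etree.ElementTree as ET

XML = """<parent name="data">
   <child id="1" name="alpha"/>
   <child id="2" name="beta">
      <grandchild id="2.1" name="beta-alpha"/>
   </child>
</parent>"""

root = ET.fromstring(XML)
elements = list(root.iter())  # all elements, in document order

# //@name: every "name" attribute in the document.
all_names = [(i, "name") for i, e in enumerate(elements) if "name" in e.attrib]

# //child/@name: "name" attributes of <child> elements only (a subset of the above).
child_names = [(i, "name") for i, e in enumerate(elements)
               if e.tag == "child" and "name" in e.attrib]

# Union: merge both node sets, drop duplicates, restore document order.
union = sorted(set(all_names) | set(child_names))

# Predicate: test each node in the set, keep those whose value starts with "be".
filtered = [elements[i].attrib[attr] for i, attr in union
            if elements[i].attrib[attr].startswith("be")]

print(len(all_names), len(child_names), len(union))
print(filtered)
```

The two node sets overlap on the two child attributes, yet the union is no larger than the bigger set, and the predicate then narrows it down node by node.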
Licensed under: CC-BY-SA with attribution