I think you're confusing the behavior of text() with that of converting an element node to its string value. The difference shows up when <p> has child elements that contain text, such as <acronym>.
What text() does is select individual text nodes. With no explicit axis, it uses the default child axis, so it selects text nodes that are children of the context node. Inside p[...], the context node is a <p> element. So text() inside that predicate selects only text nodes that are children (not grandchildren) of that <p>. In your example, the <p> element has three child nodes:
- text node "My "
- element node named acronym
- text node " message"
Therefore, text() (with a p element as the context node) returns a node-set of two text nodes, whose values are "My " and " message".
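The breakdown above can be checked outside XPath. This is a minimal sketch using Python's standard xml.dom.minidom; the markup is an assumption reconstructed from the description (in particular, the acronym content "error" is a guess), and describe_children is a hypothetical helper written for illustration.

```python
from xml.dom import minidom

# Assumed markup: the acronym content "error" is a guess based on the search string.
doc = minidom.parseString("<p>My <acronym>error</acronym> message</p>")
p = doc.documentElement


def describe_children(element):
    """Return (kind, value) pairs for the direct children of element."""
    described = []
    for child in element.childNodes:
        if child.nodeType == child.TEXT_NODE:
            described.append(("text", child.data))
        elif child.nodeType == child.ELEMENT_NODE:
            described.append(("element", child.tagName))
    return described


children = describe_children(p)

# Rough analogue of text() with <p> as the context node: child text nodes only.
text_values = [value for kind, value in children if kind == "text"]

for kind, value in children:
    print(kind, repr(value))
```

Note that the text inside acronym does not appear in text_values, because it is a grandchild of p, not a child.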
When a node-set is evaluated as a string, as in contains(text(), ...), XPath 1.0 treats it as follows: the string value of the first node in document order is used, and the rest are ignored. So in your example, the node-set returned by text() evaluates to "My ", that is, the content of the first child text node. Your XPath expression is then equivalent to p[contains("My ", "My error message")] for the p you're trying to match. That of course fails: the predicate is false, so no p element is returned.
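To make the failing predicate concrete, here is a rough emulation of the node-set-to-string rule in plain Python. It is not an XPath engine, only a sketch: the markup is assumed as before, and node_set_to_string is a hypothetical helper that mimics "take the first node's string value, or the empty string for an empty node-set".

```python
from xml.dom import minidom

# Assumed markup, reconstructed from the description of the example.
doc = minidom.parseString("<p>My <acronym>error</acronym> message</p>")
p = doc.documentElement

# Emulates text(): the text nodes that are direct children of <p>.
text_node_set = [n.data for n in p.childNodes if n.nodeType == n.TEXT_NODE]


def node_set_to_string(values):
    """Mimic XPath 1.0 string conversion: first node only, "" if empty."""
    return values[0] if values else ""


as_string = node_set_to_string(text_node_set)

# Emulates contains(text(), "My error message").
predicate = "My error message" in as_string

print(repr(as_string), predicate)
```

The second text node, " message", never takes part in the comparison, which is why the predicate comes out false.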
You wanted the string content of all descendant text nodes to be concatenated, but that's not what text() does. To get that, evaluate the <p> element itself as a string, e.g. p[contains(., "...")].
Here . means the context node, which inside p[...] is the p element. When you evaluate the p element as a string, the result is the concatenation of all its descendant text nodes, in document order.
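The string value of an element can be approximated in Python with ElementTree's itertext(), which walks all descendant text in document order. This is a sketch under the same assumed markup, not a substitute for running the XPath itself.

```python
import xml.etree.ElementTree as ET

# Assumed markup, reconstructed from the description of the example.
p = ET.fromstring("<p>My <acronym>error</acronym> message</p>")

# Analogue of string(.) on <p>: concatenate every descendant text node.
string_value = "".join(p.itertext())

# Analogue of contains(., "My error message").
matches = "My error message" in string_value

print(repr(string_value), matches)
```

Here the text inside acronym is included, so the comparison succeeds where the text() version did not.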