Question
I am trying to parse the XML below to get the email address out. I can get the MessageID, but I think that only works because the a: prefix in front lets me use XPath. I'm not sure how to pull out the email address. I am trying
xml.xpath("//s:Body/Discover/request/EmailAddress").children.text.to_s
and
xml.xpath("//s:Body/Discover/EmailAddress").children.text.to_s
If I do xml.xpath("//s:Body").children.text.to_s, I get the email and the version with all the newlines and tabs, but I do not want to parse the email out of that text if I do not have to.
<s:Envelope xmlns:a="http://www.w3.org/2005/08/addressing" xmlns:s="http://www.w3.org/2003/05/soap-envelope">
<s:Header>
<a:Action s:mustUnderstand="1">test url</a:Action>
<a:MessageID>mid</a:MessageID>
<a:ReplyTo>
<a:Address>test url</a:Address>
</a:ReplyTo>
<a:To s:mustUnderstand="1">test url</a:To>
</s:Header>
<s:Body>
<Discover xmlns="test url">
<request xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<EmailAddress>bob@xml.com</EmailAddress>
<RequestVersion>1.0</RequestVersion>
</request>
</Discover>
</s:Body>
</s:Envelope>
Solution
The xmlns="test url" declaration on Discover is preventing Nokogiri's XPath from matching your unprefixed element names within s:Body. Try simply
email = xml.xpath("//s:Body").first.to_xml.scan(/<EmailAddress>([^<]+)/)[0][0]
OTHER TIPS
The Discover element (and its children) is in a different namespace, and you need to specify this in your query. The second argument to the xpath method is a hash where you can associate prefixes used in the query with namespace URIs. Have a look at the section on namespaces in the Nokogiri tutorial.
With Nokogiri, if you don’t specify a namespace hash, it will automatically register any namespaces defined on the root node for you. In this case that is the a prefix for http://www.w3.org/2005/08/addressing and the s prefix for http://www.w3.org/2003/05/soap-envelope. This is why your query for //s:Body works. The namespace declaration for Discover isn’t on the root, so you have to register it yourself.
When you provide your own namespace hash, Nokogiri doesn’t add those defined on the root, so you will also need to include any of those used in your query.
In your case the following will find the EmailAddress node. The actual prefix you use doesn’t matter (here I’ve chosen t) as long as the URI matches.
xml.xpath('//s:Body/t:Discover/t:request/t:EmailAddress',
's' => "http://www.w3.org/2003/05/soap-envelope",
't' => "test url")