Find node value in XML using etree

https://stackoverflow.com/questions/22507281

17-06-2023

Question

Here is the XML:

<organizations>
    <organization>
        <orgID>152</orgID>
        <orgName>This is A</orgName>
    </organization>
    <organization>
        <orgID>1352</orgID>
        <orgName>This is B</orgName>
    </organization>
    <organization>
        <orgID>1522</orgID>
        <orgName>This is C</orgName>
    </organization>
    <organization>
        <orgID>1512</orgID>
        <orgName>This is D</orgName>
    </organization>
</organizations>

I want to get the orgName for a given orgID.

I tried:

from urllib.request import urlopen
import lxml.etree as ET

url = 'url here'
xmldata = urlopen(url).read()
root = ET.fromstring(xmldata)
for target in root.xpath('.//organization/orgID[text()="152"]'):
    print(target)

But nothing prints.

What am I doing wrong here?

Solution

One option is to select the organization that has a descendant text node equal to the ID, then read its orgName:

from lxml import etree as ET


data = """<organizations>
    <organization>
        <orgID>152</orgID>
        <orgName>This is A</orgName>
    </organization>
    <organization>
        <orgID>1352</orgID>
        <orgName>This is B</orgName>
    </organization>
    <organization>
        <orgID>1522</orgID>
        <orgName>This is C</orgName>
    </organization>
    <organization>
        <orgID>1512</orgID>
        <orgName>This is D</orgName>
    </organization>
</organizations>"""

tree = ET.fromstring(data)
print(tree.xpath('//organization[descendant::text()="1512"]/orgName/text()'))

prints:

['This is D']
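
Note that descendant::text() matches any text node under the organization, so it would also select an organization whose orgName happened to be "1512". A narrower sketch compares only the orgID child and passes the ID as an XPath variable, which lxml fills from a keyword argument to xpath(). The helper name find_org_name and the shortened sample data are illustrative, not part of the original answer.

```python
from lxml import etree as ET

data = """<organizations>
    <organization>
        <orgID>152</orgID>
        <orgName>This is A</orgName>
    </organization>
    <organization>
        <orgID>1512</orgID>
        <orgName>This is D</orgName>
    </organization>
</organizations>"""


def find_org_name(root, org_id):
    """Return the orgName for org_id, or None if nothing matches."""
    # $oid is an XPath variable; lxml binds it to the oid keyword argument.
    names = root.xpath('//organization[orgID=$oid]/orgName/text()', oid=org_id)
    return names[0] if names else None


root = ET.fromstring(data)
print(find_org_name(root, "1512"))  # This is D
print(find_org_name(root, "9999"))  # None
```

Using a variable rather than string formatting also avoids quoting problems if the ID ever comes from user input.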

OTHER TIPS

If I use the XML from the question as xmldata, your code prints something like the following:

<Element orgID at 0x2858c18>

So you should check that the URL really returns this content.

By the way, if you want to print the text of orgName, change the for statement as follows:
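
One way to do that check, sketched under the assumption that the response is a bytes object like the one read() returns: look at the root tag and the repr of each orgID's text. A namespace on the root element or stray whitespace around the ID would both make the text()="152" predicate match nothing. The describe helper and the sample bytes below are hypothetical stand-ins for the real response.

```python
from lxml import etree as ET


def describe(xmldata):
    """Return the root tag and the raw text of every orgID element."""
    root = ET.fromstring(xmldata)
    # local-name() ignores namespaces, so orgID is found either way.
    ids = [el.text for el in root.xpath('//*[local-name()="orgID"]')]
    return root.tag, ids


# Stand-in for the downloaded data; note the whitespace around the ID.
sample = (b"<organizations><organization>"
          b"<orgID> 152 </orgID><orgName>This is A</orgName>"
          b"</organization></organizations>")

tag, ids = describe(sample)
print(tag)                     # organizations
print([repr(i) for i in ids])  # shows any hidden whitespace

root = ET.fromstring(sample)
# An exact text() comparison fails on the padded value ...
exact = root.xpath('.//organization/orgID[text()="152"]')
# ... while normalize-space() trims it before comparing.
trimmed = root.xpath('.//organization/orgID[normalize-space()="152"]')
print(len(exact), len(trimmed))  # 0 1
```

If the root tag printed here starts with a {...} namespace prefix, the unprefixed paths in the question would need a namespace mapping to match.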

for target in root.xpath('.//organization/orgID[text()="152"]/following-sibling::orgName/text()'):
    print(target)
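
If lxml is not available, a similar lookup can be sketched with the standard library's xml.etree.ElementTree instead. This is a swapped-in technique: ElementTree supports only a subset of XPath (no following-sibling axis and no text() step), so the sketch selects the organization by its orgID child and then reads orgName. The shortened sample data and the names_by_id variable are illustrative.

```python
import xml.etree.ElementTree as ET

data = """<organizations>
    <organization>
        <orgID>152</orgID>
        <orgName>This is A</orgName>
    </organization>
    <organization>
        <orgID>1512</orgID>
        <orgName>This is D</orgName>
    </organization>
</organizations>"""

root = ET.fromstring(data)

# [orgID='152'] selects organizations with an orgID child whose text is 152.
org = root.find(".//organization[orgID='152']")
name = org.findtext("orgName") if org is not None else None
print(name)  # This is A

# For repeated lookups, build an ID-to-name mapping once.
names_by_id = {
    o.findtext("orgID"): o.findtext("orgName")
    for o in root.iter("organization")
}
print(names_by_id["1512"])  # This is D
```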

Licensed under: CC-BY-SA with attribution
