Parsing XML with etree in Python

https://stackoverflow.com//questions/22007185

21-12-2019

Question

For this XML:

<locations>

    <location>
        <locationid>1</locationid>
        <homeID>281</homeID>
        <buildingType>Added</buildingType>
        <address>A</address>
        <address2>This is address2</address2>
        <city>This is city/city>
        <state>State here</state>
        <zip>1234</zip>
    </location>
    <location>
        <locationid>2</locationid>
        <homeID>81</homeID>
        <buildingType>Added</buildingType>
        <address>B</address>
        <address2>This is address2</address2>
        <city>This is city/city>
        <state>State here</state>
        <zip>1234</zip>
    </location>
    .
    .
    .
    .
    <location>
        <locationid>10</locationid>
        <homeID>21</homeID>
        <buildingType>Added</buildingType>
        <address>Z</address>
        <address2>This is address2</address2>
        <city>This is city/city>
        <state>State here</state>
        <zip>1234</zip>
    </location>
</locations>

How can I get the locationid for address A using etree?

Here is my code:

import urllib2
import lxml.etree as ET

url="url for the xml"
xmldata = urllib2.urlopen(url).read()
# print xmldata
root = ET.fromstring(xmldata)
for target in root.xpath('.//location/address[text()="A"]'):
    print target.find('LocationID')

I am getting None as the output. What am I doing wrong here?

Solution

First of all, your XML is not well formed. You should take more care when you post it, so that other users don't have to fix your data.
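To see the problem concretely, here is a minimal sketch, assuming Python 3 and the standard library's xml.etree.ElementTree rather than lxml. The is_well_formed helper and the two sample strings are illustrative and not part of the original post; the first string copies the unclosed city tag from the question, and the second is only a guess at what was intended.

```python
import xml.etree.ElementTree as ET

# As posted in the question: the closing </city> tag is missing its "<".
BROKEN = "<location><city>This is city/city></location>"
# Presumably what was intended (an assumption about the real data).
FIXED = "<location><city>This is city</city></location>"


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(BROKEN))
print(is_well_formed(FIXED))
```

The broken string fails because the parser reaches the closing location tag while city is still open.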

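Your code prints None because target is the address element and find only looks at its children, while locationid is a sibling of address; tag names are also case-sensitive, so LocationID would not match locationid. You can search for the preceding sibling, like: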

import urllib2
import lxml.etree as ET

url="..."
xmldata = urllib2.urlopen(url).read()
root = ET.fromstring(xmldata)
for target in root.xpath('.//location/address[text()="A"]'):
    for location in [e for e in target.itersiblings(preceding=True) if e.tag == "locationid"]:
        print location.text

or do it directly in the XPath expression, like:

import urllib2
import lxml.etree as ET

url="..."
xmldata = urllib2.urlopen(url).read()
root = ET.fromstring(xmldata)
print root.xpath('.//location/address[text()="A"]/preceding-sibling::locationid/text()')[0]
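For experimenting without a URL, the same lookup can be sketched against an inline sample. This is only an illustration under stated assumptions: it targets Python 3 and uses the standard library's xml.etree.ElementTree instead of lxml, which does not support the preceding-sibling axis, so it filters the location element by its address child instead. The SAMPLE string is a trimmed, corrected copy of the question's data, and location_id_for is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed version of the question's data (the city tag is closed here).
SAMPLE = """
<locations>
    <location>
        <locationid>1</locationid>
        <homeID>281</homeID>
        <address>A</address>
        <city>This is city</city>
    </location>
    <location>
        <locationid>2</locationid>
        <homeID>81</homeID>
        <address>B</address>
        <city>This is city</city>
    </location>
</locations>
"""


def location_id_for(root, address):
    """Return the locationid text of the first location with this address, or None."""
    # [address='...'] keeps only <location> elements whose <address> child has that text.
    # The address is interpolated naively, so it must not contain quotes.
    return root.findtext(".//location[address='%s']/locationid" % address)


root = ET.fromstring(SAMPLE)
print(location_id_for(root, "A"))
```

Selecting the parent location first and then descending to locationid avoids depending on the order of the child elements, which the preceding-sibling approach assumes.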

Run either of them like:

python2 script.py

which yields:

1

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow