Question

I found this code on the internet and modified it.

How can I load an XML file from a directory on my disk instead of from a string?

from elementtree import ElementTree as et
# Load the xml content from a string
content = et.fromstring("C:\DATA\US_Patent_Data\2012\ipgb20120103_wk01\ipgb20120103.xml")


# Get the person or use the .findall method to get all
# people if there's more than person
applicant = content.find("applicant")
last_name = applicant.find("addressbook/last-name")
first_name = applicant.find("addressbook/first-name")

# Get the persons address
address = addressbook.find("address")
street = address.find("street")
city= address.find("city")
state =  address.find("state")
postcode = address.find("postcode")
country = address.find("country")

# Print output
print "sequence: " + applicant.attrib.get('sequence')
print "first name: " + first_name.text
print "last name: " + last_name.text
print "street: " + street.text
print "city: " + city.text
print "state: " + state.text
print "postcode: " + postcode.text
print "country: " + country.text

When I ran the program, I got the following error (only part of the traceback is shown):

  File "C:\Python27\lib\site-packages\elementtree\ElementTree.py", line 1292, in feed
self._parser.Parse(data, 0)

ExpatError: not well-formed (invalid token): line 1, column 2


Solution

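The fromstring function parses XML from a string that already contains the XML text; it does not open files. In your code the file path itself is handed to the parser as if it were XML, which is why it fails on line 1.

To read XML from a file, use parse instead. See the ElementTree documentation on parsing XML.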

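import xml.etree.ElementTree as ET
# Use a raw string (r"...") so that backslashes in the Windows path
# are not interpreted as escape sequences (e.g. "\201" is an octal escape).
tree = ET.parse(r"C:\DATA\US_Patent_Data\2012\ipgb20120103_wk01\ipgb20120103.xml")
root = tree.getroot()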
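
To make the difference between the two functions concrete, here is a minimal sketch. The sample XML and the temporary file are made up for illustration; only the element names (applicant, addressbook, last-name, first-name) are borrowed from the question, and the real patent file may be structured differently.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical sample data, loosely modelled on the element names in the question.
SAMPLE = (
    "<applicants>"
    "<applicant sequence='001'>"
    "<addressbook>"
    "<last-name>Doe</last-name>"
    "<first-name>Jane</first-name>"
    "</addressbook>"
    "</applicant>"
    "</applicants>"
)

# fromstring() takes the XML text itself and returns the root element.
root_from_text = ET.fromstring(SAMPLE)

# parse() takes a file name (or file object) and returns an ElementTree.
fd, path = tempfile.mkstemp(suffix=".xml")
with os.fdopen(fd, "w") as f:
    f.write(SAMPLE)
try:
    root_from_file = ET.parse(path).getroot()
finally:
    os.remove(path)  # clean up the temporary file

# Both roots expose the same API from here on.
applicant = root_from_file.find("applicant")
last_name = applicant.find("addressbook/last-name").text
```

Note that fromstring returns the root element directly, while parse returns a tree object on which you call getroot().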

Update: It seems your XML is not well-formed because it has multiple root elements. Try wrapping the content in a single root element:

with open('ipgb20120103.xml', 'r') as f:
    xml_string = "<root>%s</root>" % f.read()

root = ET.fromstring(xml_string)
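
The wrapping idea can be tried on a small in-memory string first. This is only a sketch: the fragment below is invented for illustration and is much simpler than a real patent file.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with two top-level elements, i.e. no single root.
fragment = (
    "<applicant><last-name>Doe</last-name></applicant>"
    "<applicant><last-name>Roe</last-name></applicant>"
)

# Parsing the fragment as-is fails, because XML allows exactly one root element.
try:
    ET.fromstring(fragment)
    parsed_unwrapped = True
except ET.ParseError:
    parsed_unwrapped = False

# Wrapping the text in an artificial <root> element makes it well-formed.
wrapped = ET.fromstring("<root>%s</root>" % fragment)
names = [a.findtext("last-name") for a in wrapped.findall("applicant")]
```

One caveat: the wrapper only helps when the file consists of bare elements. If the file contains an XML declaration (<?xml ...?>) or a DOCTYPE, those would end up inside the artificial root and the parser would still reject the text; in that case the file probably needs to be split into separate documents before parsing.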
Licensed under: CC-BY-SA with attribution