Question

I have some XML, a fragment of which looks like:

<osgb:departedMember>
<osgb:DepartedFeature fid='osgb4000000024942964'>
<osgb:boundedBy>
<gml:Box srsName='osgb:BNG'>
<gml:coordinates>188992.575,55981.029 188992.575,55981.029</gml:coordinates>
</gml:Box>
</osgb:boundedBy>
<osgb:theme>Road Network</osgb:theme>
<osgb:reasonForDeparture>Deleted</osgb:reasonForDeparture>
<osgb:deletionDate>2014-02-19</osgb:deletionDate>
</osgb:DepartedFeature>
</osgb:departedMember>

I am parsing it with:

departedmembers = doc_root.findall('{http://www.ordnancesurvey.co.uk/xml/namespaces/osgb}departedMember')
for departedmember in departedmembers:
    findWhat = '{http://www.ordnancesurvey.co.uk/xml/namespaces/osgb}DepartedFeature'
    fid = int(departedmember.find(findWhat).attrib['fid'].replace('osgb', ''))
    theme = departedmember[0].findall('{http://www.ordnancesurvey.co.uk/xml/namespaces/osgb}theme')[0].text
    reason = departedmember[0].findall('{http://www.ordnancesurvey.co.uk/xml/namespaces/osgb}reasonForDeparture')[0].text
    date = departedmember[0].findall('{http://www.ordnancesurvey.co.uk/xml/namespaces/osgb}deletionDate')[0].text

Occasionally the reason, the date, or both are absent, i.e. the element is missing altogether, not merely empty. This is legitimate according to the XSD, but I get errors when trying to read the text of a non-existent element (findall returns an empty list, so indexing it with [0] fails). To deal with that, I have wrapped the reason and date lines in try/except blocks, like:

try:
    date=departedmember[0].findall('{http://www.ordnancesurvey.co.uk/xml/namespaces/osgb}deletionDate')[0].text
except:
    pass

This works, but I hate to use except/pass like this, so it led me to wonder whether there is a nicer way to parse a document like this, where some elements are optional.


Solution

Since you are interested only in the first element returned by findall, you can replace findall(x)[0] with find(x), which returns None when nothing matches instead of raising. And if you want to avoid try/except blocks, you can use a conditional (ternary) expression.

departedmembers = doc_root.findall('{http://www.ordnancesurvey.co.uk/xml/namespaces/osgb}departedMember')
for departedmember in departedmembers:
    ...
    date = departedmember[0].find('{http://www.ordnancesurvey.co.uk/xml/namespaces/osgb}deletionDate')
    # Set date to None if the element was not found; otherwise take its text
    date = None if date is None else date.text
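
A related option, sketched here under some assumptions, is ElementTree's findtext, which looks up the first matching child and returns a default when it is absent, so the None check disappears. The wrapper element osgb:FeatureCollection, the NS prefix map, and the parse_departed function are made up for this illustration; only the osgb namespace URI and the element names come from the question.

```python
import xml.etree.ElementTree as ET

# Prefix map so paths can be written as 'osgb:theme' instead of '{uri}theme'.
NS = {"osgb": "http://www.ordnancesurvey.co.uk/xml/namespaces/osgb"}

# Hypothetical sample: the root element name is an assumption, and
# osgb:deletionDate is deliberately left out to show the optional case.
SAMPLE = """
<osgb:FeatureCollection xmlns:osgb='http://www.ordnancesurvey.co.uk/xml/namespaces/osgb'>
  <osgb:departedMember>
    <osgb:DepartedFeature fid='osgb4000000024942964'>
      <osgb:theme>Road Network</osgb:theme>
      <osgb:reasonForDeparture>Deleted</osgb:reasonForDeparture>
    </osgb:DepartedFeature>
  </osgb:departedMember>
</osgb:FeatureCollection>
"""


def parse_departed(doc_root):
    """Collect one dict per DepartedFeature; missing optional fields become None."""
    rows = []
    for feature in doc_root.findall("osgb:departedMember/osgb:DepartedFeature", NS):
        rows.append({
            "fid": int(feature.attrib["fid"].replace("osgb", "")),
            "theme": feature.findtext("osgb:theme", default=None, namespaces=NS),
            "reason": feature.findtext("osgb:reasonForDeparture", default=None, namespaces=NS),
            "date": feature.findtext("osgb:deletionDate", default=None, namespaces=NS),
        })
    return rows


rows = parse_departed(ET.fromstring(SAMPLE))
```

Note that findtext returns an empty string, not the default, when the element exists but has no text, so "missing" and "empty" stay distinguishable.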

Other tips

Yes. The issue is not the search method but indexing into the returned list when it is empty. You can write your code like this:

results = departedmember[0].findall('{http://www.ordnancesurvey.co.uk/xml/namespaces/osgb}deletionDate')

if results:
    date = results[0].text
else:
    # there is no such element;
    # do what you want in this case, e.g. fall back to None
    date = None
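
If the same check is needed for several optional elements, it could be wrapped in a small helper. This is only a sketch: first_text, its default parameter, and the OSGB constant are hypothetical names introduced for illustration, and the sample feature omits reasonForDeparture and deletionDate on purpose.

```python
import xml.etree.ElementTree as ET

# Namespace from the question, in the '{uri}' form ElementTree expects.
OSGB = "{http://www.ordnancesurvey.co.uk/xml/namespaces/osgb}"


def first_text(parent, tag, default=None):
    """Return the text of the first child matching tag, or default if there is none."""
    results = parent.findall(OSGB + tag)
    if results:
        return results[0].text
    return default


# Hypothetical feature with only the theme present.
feature = ET.fromstring(
    "<osgb:DepartedFeature "
    "xmlns:osgb='http://www.ordnancesurvey.co.uk/xml/namespaces/osgb' "
    "fid='osgb4000000024942964'>"
    "<osgb:theme>Road Network</osgb:theme>"
    "</osgb:DepartedFeature>"
)

theme = first_text(feature, "theme")
date = first_text(feature, "deletionDate")
reason = first_text(feature, "reasonForDeparture", default="unknown")
```

The default argument lets each call site decide what a missing element should mean, rather than fixing that choice inside the helper.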
License: CC-BY-SA with attribution
Not affiliated with StackOverflow