Question

I'm getting an AttributeError: 'NoneType' object has no attribute 'encode' when parsing XML patent inventor data. I'm trying to pull the first inventor plus their address information into a string, as shown below:

inventor1 = first(doc.xpath('//applicants/applicant/addressbook/last-name/text()'))
inventor2 = first(doc.xpath('//applicants/applicant/addressbook/first-name/text()'))
inventor3 = first(doc.xpath('//applicants/applicant/addressbook/address/city/text()'))
inventor4 = first(doc.xpath('//applicants/applicant/addressbook/address/state/text()'))
inventor5 = first(doc.xpath('//applicants/applicant/addressbook/address/country/text()'))
inventor = str(inventor2.encode("UTF-8")) + " " + str(inventor1.encode("UTF-8"))
inventors2 = str(inventor3.encode("UTF-8")) + ", " + str(inventor4) + ", " + str(inventor5)
inventors = str(inventor) + ", " + str(inventors2)

print "DocID: {0}\nGrantDate: {1}\nApplicationDate: {2}\nNumber of Claims: {3}\nExaminers: {4}\nAssignee: {5}\nInventor: {6}\n".format(docID,grantdate,applicationdate,claimsNum,examiners.encode("UTF-8"),assignees,inventors)

The problem is that several parts of this long XML file raise UnicodeEncodeError: 'ascii' codec can't encode character. I need the .encode calls in my Python code to avoid that error, but with them in place I get this:

Traceback (most recent call last):
  File "C:\Documents and Settings\Desktop\FINAL BART INFO ONE.py", line 87, in <module>
    inventor = str(inventor2.encode("UTF-8")) + " " + str(inventor1.encode("UTF-8"))
AttributeError: 'NoneType' object has no attribute 'encode'

Is there any way to ignore the None values that are returned when nothing is there? Do I need to define a function, or use a different kind of .encode for my print?

By the way, I'm creating a database from an input file that is actually multiple XML files appended into one file (data sourced from Google Patents).


Solution

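A quick and dirty fix is to call .encode only when a value is present: str(inventor1.encode("UTF-8") if inventor1 else inventor1). Note that when inventor1 is None this produces the string "None"; put "" in the else branch if you would rather get an empty string.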
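
If the same guard is needed for every field, it may be tidier to drop the missing values first and encode once at the end. The following is only a sketch: join_present is a hypothetical helper name, and the sample values stand in for whatever first(doc.xpath(...)) returns (a unicode string, or None when the element is absent).

```python
def join_present(parts, sep=u", "):
    """Join only the parts that are neither None nor empty."""
    return sep.join(p for p in parts if p)


# Made-up sample values; None marks a field missing from the XML.
first_name = u"Jos\u00e9"
last_name = None
city = u"Z\u00fcrich"
state = None
country = u"CH"

name = join_present([first_name, last_name], sep=u" ")
address = join_present([city, state, country])
inventors = join_present([name, address])

# Encode the finished string once, just before printing or writing it,
# instead of calling .encode on each piece (which fails on None).
encoded = inventors.encode("UTF-8")
```

Because missing fields are filtered out before joining, no stray separators or "None" strings end up in the output.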

Licensed under: CC-BY-SA with attribution