The issue is with printing Unicode to the Windows console. Namely, the character '•' (U+2022) can't be represented in cp437, the code page used by your console.
To reproduce the problem, try:
print u'\u2022'
You could set the PYTHONIOENCODING environment variable to instruct Python to replace all unrepresentable characters with the corresponding XML character references:
T:\> set PYTHONIOENCODING=cp437:xmlcharrefreplace
T:\> python your_script.py
Or encode the text to bytes before printing:
print u'\u2022'.encode('cp437', 'xmlcharrefreplace')
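To make the effect of the xmlcharrefreplace error handler concrete, here is a minimal sketch. It assumes Python 3 syntax (the snippets above use the Python 2 print statement), and the helper name encode_for_console is hypothetical, not something from the question.

```python
def encode_for_console(text, encoding='cp437'):
    """Encode text, replacing unencodable characters with XML character references."""
    return text.encode(encoding, 'xmlcharrefreplace')


if __name__ == '__main__':
    # U+2022 (bullet) has no mapping in Python's cp437 codec,
    # so it is emitted as the decimal reference &#8226;
    print(encode_for_console('build \u2022 location'))
```

Characters that cp437 can represent pass through unchanged; only the unrepresentable ones are turned into references.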
Answer to your initial question
To print the text of each <build_location/> element:
import sys
from xml.etree import cElementTree as etree
input_file = sys.stdin # filename or file object
tree = etree.parse(input_file)
print('\n'.join(elem.text for elem in tree.iter('build_location')))
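For a self-contained variant that can be run without an input file, the following sketch may help. It assumes Python 3, where xml.etree.ElementTree replaces cElementTree (the latter was removed in Python 3.9). The sample document and the function name build_location_texts are hypothetical, since the real input is not shown in the question. It also skips elements with no text, which would otherwise make '\n'.join() fail on None.

```python
import io
from xml.etree import ElementTree as etree

# Hypothetical sample document; the real input is not shown in the question.
SAMPLE_XML = b"""<builds>
  <build><build_location>C:/builds/one</build_location></build>
  <build><build_location/></build>
  <build><build_location>C:/builds/two</build_location></build>
</builds>"""


def build_location_texts(source):
    """Return the text of every <build_location> element, skipping empty ones."""
    tree = etree.parse(source)  # source may be a filename or a file object
    return [elem.text for elem in tree.iter('build_location')
            if elem.text is not None]


if __name__ == '__main__':
    print('\n'.join(build_location_texts(io.BytesIO(SAMPLE_XML))))
```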
If the input file is large, iterparse() could be used to process it incrementally instead of loading the whole tree into memory:
import sys
from xml.etree import cElementTree as etree
input_file = sys.stdin
context = iter(etree.iterparse(input_file, events=('start', 'end')))
_, root = next(context) # get root element
for event, elem in context:
    if event == 'end' and elem.tag == 'build_location':
        print(elem.text)
        root.clear() # free memory
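The same incremental approach can be wrapped in a generator so the caller decides what to do with each value. This is only a sketch under a few assumptions: Python 3 with xml.etree.ElementTree, a hypothetical function name iter_build_locations, and a made-up in-memory sample standing in for the real file.

```python
import io
from xml.etree import ElementTree as etree


def iter_build_locations(source):
    """Yield the text of each <build_location> element while parsing incrementally."""
    context = iter(etree.iterparse(source, events=('start', 'end')))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == 'end' and elem.tag == 'build_location':
            yield elem.text
            root.clear()  # drop already-processed children to free memory


if __name__ == '__main__':
    # Hypothetical sample input; replace with a filename or an open file.
    sample = io.BytesIO(
        b'<builds>'
        b'<build><build_location>a</build_location></build>'
        b'<build><build_location>b</build_location></build>'
        b'</builds>'
    )
    for text in iter_build_locations(sample):
        print(text)
```

Because the text is yielded before root.clear() runs, each value is handed to the caller before the processed elements are released.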