This is a namespaced XML document. Therefore you need to address the nodes using their respective namespaces.
The namespaces used in the document are defined at the top:
xmlns:tc2="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:tp1="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"
xmlns="http://www.topografix.com/GPX/1/1"
So the first namespace is mapped to the short form (prefix) tc2, and would be used in an element like <tc2:foobar/>. The last one, which doesn't have a short form after the xmlns, is called the default namespace, and it applies to all elements in the document that don't explicitly use a namespace, so it applies to your <trkpt /> elements as well.
Therefore you would need to write root.iter('{http://www.topografix.com/GPX/1/1}trkpt') to select these elements.
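To see the effect of the default namespace, here is a minimal sketch. It assumes a made-up two-point GPX fragment (not your actual file) and uses the standard library's xml.etree.ElementTree, which behaves like lxml for this part of the API, so it runs without extra installs.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-point GPX fragment, made up for illustration.
SAMPLE = """<gpx xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="47.1" lon="8.5"/>
    <trkpt lat="47.2" lon="8.6"/>
  </trkseg></trk>
</gpx>"""

root = ET.fromstring(SAMPLE)

# The parser stores the namespace URI as part of every tag name.
root_tag = root.tag

# A bare name matches nothing; the fully qualified name matches both points.
unqualified = list(root.iter('trkpt'))
qualified = list(root.iter('{http://www.topografix.com/GPX/1/1}trkpt'))

print(root_tag)
print(len(unqualified), len(qualified))
```

The bare name finds no elements because, as far as the parser is concerned, no element called just trkpt exists in the document.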
In order to also get time and elevation, you can use trkpt.find() to access these elements below the trkpt node, and then element.text to retrieve those elements' text content (as opposed to attributes like lat and lon, which you read with trkpt.get()). Because the time and ele elements also use the default namespace, you'll have to use the {namespace}element syntax again to select those nodes.
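The difference between attributes and child elements can be sketched as follows. The sample track point and the child_text helper are hypothetical, introduced only for illustration. The helper also covers the case where a point has no such child, in which find() returns None. The sketch again uses the standard library's xml.etree.ElementTree, where find(), get() and .text work the same way as in lxml.

```python
import xml.etree.ElementTree as ET

NS = 'http://www.topografix.com/GPX/1/1'

# Hypothetical single track point, made up for illustration (no <time> child).
SAMPLE = """<trkpt xmlns="http://www.topografix.com/GPX/1/1" lat="47.1" lon="8.5">
  <ele>512.0</ele>
</trkpt>"""


def child_text(node, name):
    """Return the text of the namespaced child `name`, or None if it is absent."""
    child = node.find('{%s}%s' % (NS, name))
    return child.text if child is not None else None


trkpt = ET.fromstring(SAMPLE)

lat = trkpt.get('lat')            # attribute: unprefixed, so no namespace needed
ele = child_text(trkpt, 'ele')    # child element: namespace required
time = child_text(trkpt, 'time')  # missing in this sample, so None instead of an error

print(lat, ele, time)
```

Unprefixed attributes are not affected by the default namespace, which is why get('lat') works with the plain name.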
So you could use something like this:
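import csv
import lxml.etree

NS = 'http://www.topografix.com/GPX/1/1'
header = ('lat', 'lon', 'ele', 'time')

# newline='' lets the csv module control line endings itself (Python 3)
with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(header)
    # x is assumed to already hold the contents of the .gpx document
    root = lxml.etree.fromstring(x)
    for trkpt in root.iter('{%s}trkpt' % NS):
        # lat and lon are attributes of the trkpt element
        lat = trkpt.get('lat')
        lon = trkpt.get('lon')
        # ele and time are child elements in the default namespace
        ele = trkpt.find('{%s}ele' % NS).text
        time = trkpt.find('{%s}time' % NS).text
        row = lat, lon, ele, time
        writer.writerow(row)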
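As a possible alternative to repeating the {namespace}element syntax, find(), findall() and findtext() accept a prefix-to-URI mapping. The sketch below is only an illustration under stated assumptions: the sample data, the prefix gpx, and the names NSMAP and gpx_to_rows are all made up, and it writes to an in-memory buffer instead of output.csv. It uses the standard library's xml.etree.ElementTree; lxml accepts the same namespaces argument for these methods.

```python
import csv
import io
import xml.etree.ElementTree as ET

# The prefix 'gpx' is an arbitrary local choice; only the URI has to match.
NSMAP = {'gpx': 'http://www.topografix.com/GPX/1/1'}

# Hypothetical sample data, made up for illustration.
SAMPLE = """<gpx xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="47.1" lon="8.5"><ele>512.0</ele><time>2020-01-01T10:00:00Z</time></trkpt>
    <trkpt lat="47.2" lon="8.6"><ele>515.5</ele><time>2020-01-01T10:00:05Z</time></trkpt>
  </trkseg></trk>
</gpx>"""


def gpx_to_rows(xml_text):
    """Collect (lat, lon, ele, time) tuples for every track point."""
    root = ET.fromstring(xml_text)
    rows = []
    for trkpt in root.findall('.//gpx:trkpt', NSMAP):
        rows.append((
            trkpt.get('lat'),
            trkpt.get('lon'),
            # findtext returns the default when the child is missing
            trkpt.findtext('gpx:ele', default='', namespaces=NSMAP),
            trkpt.findtext('gpx:time', default='', namespaces=NSMAP),
        ))
    return rows


rows = gpx_to_rows(SAMPLE)

# Write to an in-memory buffer here; a real script would open a file instead.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(('lat', 'lon', 'ele', 'time'))
writer.writerows(rows)
print(buf.getvalue())
```

Note that iter() does not take such a mapping, which is why this variant uses findall() with the './/' descendant path instead.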
For more information on XML namespaces, see the Namespaces section in the lxml tutorial and the Wikipedia article on XML Namespaces. Also see GPS eXchange Format for some details on the .gpx format.