I want to process the data between specific tags in a .tcx file (an XML format) with Python.
The file format is as follows.
<Track>
<Trackpoint>
<Time>2015-08-29T22:04:39.000Z</Time>
<Position>
<LatitudeDegrees>37.198049426078796</LatitudeDegrees>
<LongitudeDegrees>127.07204628735781</LongitudeDegrees>
</Position>
<AltitudeMeters>34.79999923706055</AltitudeMeters>
<DistanceMeters>7.309999942779541</DistanceMeters>
<HeartRateBpm>
<Value>102</Value>
</HeartRateBpm>
<Cadence>76</Cadence>
<Extensions>
<TPX xmlns="http://www.garmin.com/xmlschemas/ActivityExtension/v2">
<Watts>112</Watts>
</TPX>
</Extensions>
</Trackpoint>
....Lots of <Trackpoint> ... </Trackpoint>
</Track>
Eventually, I'll build a data table with columns such as 'Latitude, Altitude, ... Watts'.
First I tried to make a list from the tagged data (like <Watts> ... </Watts>) with BeautifulSoup, XPath, etc.
But I'm new to these tools.
How can I grab the data between tags in an XML file with Python?
You can upload a TCX file to Strava (Web) or Garmin Connect (Web) to view the workout data it contains. You can also open a TCX file in the desktop versions of Google Earth (cross-platform), though Google Earth will show only the running or biking route the file contains.
A TCX file is plain XML text rather than a Zip archive, so it can be opened directly in any text editor. Some users also report opening TCX files with Microsoft Word Viewer or Microsoft Excel Viewer.
TCX files contain more information than GPX files such as heart rate, cadence, and watts. TCX files exported from Strava will also contain power data. TCX export only works for activities with GPS data, which may exclude indoor activities with no GPS.
A TCX (Training Center XML) file is a data exchange format used to share data between fitness devices. It was introduced in 2008 with Garmin's Training Center product. Workout data such as heart rate, running cadence, bicycle cadence, calories, and lap time is stored in XML format inside the TCX file.
You could use the lxml module, along with XPath. lxml is good for parsing XML/HTML, traversing element trees and returning element text/attributes. You can select particular elements, sets of elements or attributes of elements using XPath. Using your example data:
content = '''
<Track>
<Trackpoint>
<Time>2015-08-29T22:04:39.000Z</Time>
<Position>
<LatitudeDegrees>37.198049426078796</LatitudeDegrees>
<LongitudeDegrees>127.07204628735781</LongitudeDegrees>
</Position>
<AltitudeMeters>34.79999923706055</AltitudeMeters>
<DistanceMeters>7.309999942779541</DistanceMeters>
<HeartRateBpm>
<Value>102</Value>
</HeartRateBpm>
<Cadence>76</Cadence>
<Extensions>
<TPX xmlns="http://www.garmin.com/xmlschemas/ActivityExtension/v2">
<Watts>112</Watts>
</TPX>
</Extensions>
</Trackpoint>
....Lots of <Trackpoint> ... </Trackpoint>
</Track>
'''
from lxml import etree
tree = etree.XML(content)
time = tree.xpath('Trackpoint/Time/text()')
print(time)
Output
['2015-08-29T22:04:39.000Z']
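To get from a single tag to the table described in the question, one option is to loop over every Trackpoint and collect one row per point. The sketch below is only an illustration under a few assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml (so nothing extra needs installing), it relies on the `{*}` namespace wildcard that ElementTree supports from Python 3.8, and the names SAMPLE, COLUMNS and trackpoint_rows are made up for this example. The wildcard matters because Watts sits inside the TPX element's namespace, and a full TCX file normally declares a default namespace on its root element as well, so a plain path like 'Trackpoint/Time' may match nothing there.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample from the question (one Trackpoint only).
SAMPLE = """<Track>
  <Trackpoint>
    <Time>2015-08-29T22:04:39.000Z</Time>
    <Position>
      <LatitudeDegrees>37.198049426078796</LatitudeDegrees>
      <LongitudeDegrees>127.07204628735781</LongitudeDegrees>
    </Position>
    <AltitudeMeters>34.79999923706055</AltitudeMeters>
    <DistanceMeters>7.309999942779541</DistanceMeters>
    <HeartRateBpm><Value>102</Value></HeartRateBpm>
    <Cadence>76</Cadence>
    <Extensions>
      <TPX xmlns="http://www.garmin.com/xmlschemas/ActivityExtension/v2">
        <Watts>112</Watts>
      </TPX>
    </Extensions>
  </Trackpoint>
</Track>"""

# Hypothetical column names mapped to paths relative to a Trackpoint.
# "{*}" matches the tag in any namespace or in none (Python 3.8+).
COLUMNS = {
    "Time": "{*}Time",
    "Latitude": "{*}Position/{*}LatitudeDegrees",
    "Longitude": "{*}Position/{*}LongitudeDegrees",
    "Altitude": "{*}AltitudeMeters",
    "Distance": "{*}DistanceMeters",
    "HeartRate": "{*}HeartRateBpm/{*}Value",
    "Cadence": "{*}Cadence",
    "Watts": "{*}Extensions/{*}TPX/{*}Watts",
}


def trackpoint_rows(root):
    """Return one dict per Trackpoint; a missing tag yields None."""
    rows = []
    for point in root.findall(".//{*}Trackpoint"):
        rows.append({name: point.findtext(path) for name, path in COLUMNS.items()})
    return rows


# For a file on disk, ET.parse(path).getroot() would replace ET.fromstring.
rows = trackpoint_rows(ET.fromstring(SAMPLE))
for row in rows:
    print(row)
```

All values come back as strings, so convert them with float() or int() where numbers are needed. The list of dicts can then be written out with csv.DictWriter or passed to pandas.DataFrame to form the table.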