I've inherited a spreadsheet module that reads *.xlsx files with XML functions. The app maps data using sheet names and that part of the module is pretty broken.
I don't have a month to find and read the specs of the Office Open XML format so I've composed a quick hack after a quick look to some samples files:
Open xl/workbook.xml and loop /workbook/sheets:
name.r:id.Open xl/_rels/workbook.xml.rels and loop /Relationships filtering by Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet:
Id.Target.Is this algorithm correct?
This is an older post and OP has likely moved on, but in case any one else is interested heres what I found.
I came across this problem as well while writing a simple parser to do some XLSX massaging and couldn't find the documentation to indicate the exact way the XML file names map to the sheets defined in workbook.xml.
Here's what I discovered:
There seems to be three ways to determine this; I've illustrated 2 below and OP's would be the third.
First, given from what I've seen OP's solution should indeed work and is probably the correct way to do this as the relationship between reference IDs and files are maintained in the .rels files. I took a different approach and, in hindsight, I should have used OPs solution but I didn't known exactly how the relationships worked at the time.
Given this excerpt from a xl/workbook.xml file which is a fragment of an XLSX file generated from Excel where I:
<sheets>
<sheet name="Third Sheet" sheetId="3" r:id="rId1"/>
<sheet name="First Sheet" sheetId="1" r:id="rId2"/>
</sheets>
In the extracted XML files the XLSX sheet mapping are listed below based on the content I originally added to the sheets:
xl/worksheets/sheet1.xml ---> sheetId="3" Third Sheet r:id="rid1"
xl/worksheets/sheet2.xml ---> sheetId="1" First Sheet r:id="rid2"
So from that it looks to be that the sheet names can map as such:
The positional index + 1 of the sheet element under the sheets element in the xl/workbook.xml file matches the trailing ID on the sheet filename.
The trailing number of the id attribute (rid1 and rid3) matches the trailing ID on the sheet filename.
PSA:
Don't be like me and initially assume the sheetID attribute maps to the sheet filename, that is incorrect. Also, use OP's solution as it is more correct and relies on hard references rather than mine which infers the references.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With