Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Location of query definition in XLSX file

After creating an xlsx file with the following custom query (loading data from the jsonplaceholder service from typicode.com)

let
    Source = Json.Document(Web.Contents("https://jsonplaceholder.typicode.com/posts")),
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"userId", "id", "title", "body"}, {"userId", "id", "title", "body"})
in
    #"Expanded Column1"

and unzipping it I would expect the above string (the query definition) to be found somewhere in the resulting folder structure in some form. The goal is to programatically replace the URL, but somehow the only query definition I can find is

<connection id="1" keepAlive="1" name="Query - posts" description="Connection to the 'posts' query in the workbook." type="5" refreshedVersion="6" background="1" saveData="1">
    <dbPr connection="Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=posts;Extended Properties=&quot;&quot;" command="SELECT * FROM [posts]"/>
</connection>

Although I wouldn't recommend anybody to download & run office files from strangers, I did upload the xlsx to nofile.io.

I would expect some formula property on the queryTable, but the query table definition just looks like

<queryTable xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" name="ExternalData_1" connectionId="1" autoFormatId="16" applyNumberFormats="0" applyBorderFormats="0" applyFontFormats="0" applyPatternFormats="0" applyAlignmentFormats="0" applyWidthHeightFormats="0">
   <queryTableRefresh nextId="5">
      <queryTableFields count="4">
         <queryTableField id="1" name="userId" tableColumnId="5" />
         <queryTableField id="2" name="id" tableColumnId="2" />
         <queryTableField id="3" name="title" tableColumnId="3" />
         <queryTableField id="4" name="body" tableColumnId="4" />
      </queryTableFields>
   </queryTableRefresh>
</queryTable>

Ideally I would like just the information about how the spec stores this information (as I want to manually edit this on the frontend in the browser), although any solution which allows me to generate this would be perfect.

like image 700
David Mulder Avatar asked Oct 29 '22 22:10

David Mulder


1 Answers

Although this is an old post, I'm adding references to the Open Specifications that detail exactly how the query information can be found in Excel (.xlsx) Office Open XML file format. As was already noted, the customXml\item[x].xml parts will contain DataMashup format information. The specifications for those packaging and part data can be found in the following two links:

[MS-QDEFF]: Query Definition File Format

[MS-QDEIF]: Query Definition Interoperability Format

One thing to note is that the top level binary stream (after base64 decoding the value from the DataMashup element) has a 4 byte version field at the beginning (which is 0) and a 4 byte size field before each of the Package, Permissions and Metadata pieces. Once each piece is extracted, it can be used as an independent PK Zip package (actually an OPC - Open Packaging Convention).

like image 153
Tom Jebo - Microsoft Avatar answered Nov 09 '22 12:11

Tom Jebo - Microsoft