
How to properly assemble a valid xlsx file from its internal sub-components?

Tags: xml, zip, xlsx

I'm trying to create an xlsx file programmatically on iOS. Since the data inside an xlsx file is basically stored in separate XML files, I tried to recreate the xlsx structure with all its files and subdirectories, compress them into a zip file, and change its extension to xlsx. I use the GDataXML parser/writer to create all the necessary XML files. However, the file I get can't be opened as an xlsx file. Even if I extract everything from a valid xlsx file, recreate each XML file manually by copying the data from the originals, and compress them manually, I still can't produce a valid xlsx file.

The questions are:

  • Is xlsx really just an archive containing XML files?
  • How do I create a valid xlsx file programmatically if I can't just compress the XML files into a zip file and set its extension to xlsx?
nick130586 asked Jun 18 '12 11:06



1 Answer

In answer to your questions:

  1. XLSX is just a collection of XML files in a zip container. There is no other magic.
  2. If you unzip a valid XLSX file, re-zip it, and can't read the result, then the problem is generally with the files being re-zipped or, less likely, with the zipping software. The main thing to check is that the directory structure was maintained in the zip file.
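To make the first point concrete, here is a minimal sketch in Python (used only for illustration; the question is about iOS, where the same layout would apply with whatever zip library is available). The XML strings in `PARTS` and the `build_xlsx` helper are assumptions written for this example, not taken from the question or from any real workbook. They are intended to be close to the smallest set of parts a one-sheet workbook needs, but have not been checked against Excel here. Note that every entry name is relative to the archive root.

```python
import io
import zipfile

XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
PKG = "http://schemas.openxmlformats.org/package/2006"
DOC = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
CT = "application/vnd.openxmlformats-officedocument.spreadsheetml"

# Hypothetical minimal parts for a workbook with one sheet and one number.
PARTS = {
    "[Content_Types].xml": XML_DECL
    + f'<Types xmlns="{PKG}/content-types">'
    + '<Default Extension="rels" ContentType='
    + '"application/vnd.openxmlformats-package.relationships+xml"/>'
    + '<Default Extension="xml" ContentType="application/xml"/>'
    + f'<Override PartName="/xl/workbook.xml" ContentType="{CT}.sheet.main+xml"/>'
    + f'<Override PartName="/xl/worksheets/sheet1.xml" ContentType="{CT}.worksheet+xml"/>'
    + "</Types>",
    "_rels/.rels": XML_DECL
    + f'<Relationships xmlns="{PKG}/relationships">'
    + f'<Relationship Id="rId1" Type="{DOC}/officeDocument" Target="xl/workbook.xml"/>'
    + "</Relationships>",
    "xl/workbook.xml": XML_DECL
    + f'<workbook xmlns="{MAIN}" xmlns:r="{DOC}">'
    + '<sheets><sheet name="Sheet1" sheetId="1" r:id="rId1"/></sheets>'
    + "</workbook>",
    "xl/_rels/workbook.xml.rels": XML_DECL
    + f'<Relationships xmlns="{PKG}/relationships">'
    + f'<Relationship Id="rId1" Type="{DOC}/worksheet" Target="worksheets/sheet1.xml"/>'
    + "</Relationships>",
    "xl/worksheets/sheet1.xml": XML_DECL
    + f'<worksheet xmlns="{MAIN}">'
    + '<sheetData><row r="1"><c r="A1"><v>42</v></c></row></sheetData>'
    + "</worksheet>",
}


def build_xlsx(parts):
    """Zip the given {entry name: XML string} mapping and return the bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, xml in parts.items():
            # Entry names are written exactly as given: no leading slash
            # and no extra top-level folder.
            zf.writestr(name, xml.encode("utf-8"))
    return buf.getvalue()
```

Writing the result to disk is then a matter of `open("example.xlsx", "wb").write(build_xlsx(PARTS))`.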

Example of the contents of an xlsx file:

    unzip -l example.xlsx
    Archive:  example.xlsx
      Length     Date   Time    Name
     --------    ----   ----    ----
          769  10-15-14 09:23   xl/worksheets/sheet1.xml
          550  10-15-14 09:22   xl/workbook.xml
          201  10-15-14 09:22   xl/sharedStrings.xml
          ...
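The same kind of listing can be checked programmatically. The sketch below is a hypothetical helper (`check_layout` is a name made up for this example) that reads the entry names with Python's standard `zipfile` module and flags two layout mistakes: a missing `[Content_Types].xml` at the archive root, and every entry being nested under a single top-level folder. It only inspects entry names; it does not validate the XML itself.

```python
import io
import zipfile


def check_layout(xlsx_bytes):
    """Return a list of layout problems found in the archive (empty if none)."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as zf:
        # Ignore explicit directory entries; only files matter here.
        names = [n for n in zf.namelist() if not n.endswith("/")]

    problems = []
    if "[Content_Types].xml" not in names:
        problems.append("[Content_Types].xml is not at the archive root")

    # If every file sits under the same first path component, the
    # containing folder was probably zipped instead of its contents.
    tops = {n.split("/", 1)[0] for n in names}
    if names and len(tops) == 1 and all("/" in n for n in names):
        problems.append("all entries are nested under '%s/'" % tops.pop())
    return problems
```

For a file on disk, it could be called as `check_layout(open("example.xlsx", "rb").read())`.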

I regularly unzip XLSX files, make minor changes for testing and re-zip them without any issue.

Update: The important thing is to avoid zipping the parent directory. Here is an example using the zip system utility on Linux or OS X:

    # Unzip an xlsx file into a directory.
    unzip example.xlsx -d newdir

    # Make some valid changes to the files.
    cd newdir/
    vi xl/worksheets/sheet1.xml

    # Rezip the files *FROM* the unzipped directory.
    # Note: you could also re-zip to the original file if required.
    find . -type f | xargs zip ../newfile.xlsx

    # Check the file looks okay.
    # (xdg-open is the Linux command; on OS X use "open" instead.)
    cd ..
    unzip -l newfile.xlsx
    xdg-open newfile.xlsx
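When the re-zip step has to happen in code rather than in a shell, the same rule applies: entry names must be relative to the unzipped directory, not to its parent. A rough Python equivalent of the `find . -type f | xargs zip` step might look like this. `rezip_dir` is a hypothetical helper written for this example, and it assumes the output file is written somewhere outside `src_dir`.

```python
import os
import zipfile


def rezip_dir(src_dir, out_path):
    """Zip the files under src_dir with names relative to src_dir itself."""
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for fname in files:
                full = os.path.join(root, fname)
                # Strip the src_dir prefix so "newdir/xl/workbook.xml"
                # is stored as "xl/workbook.xml".
                rel = os.path.relpath(full, src_dir)
                # Zip entry names use forward slashes on every platform.
                zf.write(full, arcname=rel.replace(os.sep, "/"))
```

With the directory from the shell example, the call would be `rezip_dir("newdir", "newfile.xlsx")`.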
jmcnamara answered Sep 21 '22 12:09