How does Excel read an XML file?

Tags: php, xml, excel
I have done a lot of research on converting an XML file to a 2D array the way Excel does. I am trying to reproduce the algorithm Excel uses when you open an XML file in it.

<items>
    <item>
        <sku>abc 1</sku>
        <title>a book 1</title>
        <price>42 1</price>
        <attributes>
            <attribute>
                <name>Number of pages 1</name>
                <value>123 1</value>
            </attribute>
            <attribute>
                <name>Author 1</name>
                <value>Rob dude 1</value>
            </attribute>
        </attributes>
        <contributors>
            <contributor>John 1</contributor>
            <contributor>Ryan 1</contributor>
        </contributors>
        <isbn>12345</isbn>
    </item>
    <item>
        <sku>abc 2</sku>
        <title>a book 2</title>
        <price>42 2</price>
        <attributes>
            <attribute>
                <name>Number of pages 2</name>
                <value>123 2</value>
            </attribute>
            <attribute>
                <name>Author 2</name>
                <value>Rob dude 2</value>
            </attribute>
        </attributes>
        <contributors>
            <contributor>John 2</contributor>
            <contributor>Ryan 2</contributor>
        </contributors>
        <isbn>6789</isbn>
     </item>
</items>

I want to convert it to a 2-dimensional array, laid out the same way Excel shows the file when you open it there:

[image: the XML file opened in Excel]

So far I can extract the column labels like Excel does:

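function getColNames($array) {
    $cols = array();
    foreach ($array as $val) {
        // "complete" entries are elements without child elements (the leaves)
        if (is_array($val) && $val['type'] == 'complete' && !in_array($val['tag'], $cols)) {
            $cols[] = $val['tag'];
        }
    }
    return $cols;
}

// $simple holds the XML shown above as a string.
// Note: tag names come back upper-cased unless XML_OPTION_CASE_FOLDING is disabled.
$p = xml_parser_create();
xml_parse_into_struct($p, $simple, $vals, $index);
xml_parser_free($p);

$cols = getColNames($vals);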

Goal

I want it to generate an array like this:

array (
    0 => array (
        'sku'=>'abc 1',
        'title'=>'a book 1',
        'price'=>'42 1',
        'name'=>'Number of pages 1',
        'value'=>'123 1',
        'isbn'=>12345
    ),
    1 => array (
        'sku'=>'abc 1',
        'title'=>'a book 1',
        'price'=>'42 1',
        'name'=>'Author 1',
        'value'=>'Rob dude 1',
        'isbn'=>12345
    ),
    2 => array (
        'sku'=>'abc 1',
        'title'=>'a book 1',
        'price'=>'42 1',
        'contributor'=>'John 1',
        'isbn'=>12345
    ),
    3 => array (
        'sku'=>'abc 1',
        'title'=>'a book 1',
        'price'=>'42 1',
        'contributor'=>'Ryan 1',
        'isbn'=>12345
    ),
)

Sample 2 XML:

<items>
    <item>
       <sku>abc 1</sku>
       <title>a book 1</title>
       <price>42 1</price>
       <attributes>
          <attribute>
              <name>Number of pages 1</name>
              <value>123 1</value>
          </attribute>
          <attribute>
              <name>Author 1</name>
              <value>Rob dude 1</value>
          </attribute>
       </attributes>
       <contributors>
          <contributor>John 1</contributor>
          <contributor>Ryan 1</contributor>
       </contributors>
       <isbns>
            <isbn>12345a</isbn>
            <isbn>12345b</isbn>
       </isbns>
    </item>
    <item>
       <sku>abc 2</sku>
       <title>a book 2</title>
       <price>42 2</price>
       <attributes>
          <attribute>
              <name>Number of pages 2</name>
              <value>123 2</value>
          </attribute>
          <attribute>
              <name>Author 2</name>
              <value>Rob dude 2</value>
          </attribute>
       </attributes>
       <contributors>
          <contributor>John 2</contributor>
          <contributor>Ryan 2</contributor>
       </contributors>
       <isbns>
            <isbn>6789a</isbn>
            <isbn>6789b</isbn>
       </isbns>
    </item>
</items>

Sample 3 XML:

<items>
<item>
   <sku>abc 1</sku>
   <title>a book 1</title>
   <price>42 1</price>
   <attributes>
      <attribute>
          <name>Number of pages 1</name>
          <value>123 1</value>
      </attribute>
      <attribute>
          <name>Author 1</name>
          <value>Rob dude 1</value>
      </attribute>
   </attributes>
   <contributors>
      <contributor>John 1</contributor>
      <contributor>Ryan 1</contributor>
   </contributors>
   <isbns>
        <isbn>
            <name>isbn 1</name>
            <value>12345a</value>
        </isbn>
        <isbn>
            <name>isbn 2</name>
            <value>12345b</value>
        </isbn>
   </isbns>
</item>
<item>
   <sku>abc 2</sku>
   <title>a book 2</title>
   <price>42 2</price>
   <attributes>
      <attribute>
          <name>Number of pages 2</name>
          <value>123 2</value>
      </attribute>
      <attribute>
          <name>Author 2</name>
          <value>Rob dude 2</value>
      </attribute>
   </attributes>
   <contributors>
      <contributor>John 2</contributor>
      <contributor>Ryan 2</contributor>
   </contributors>
   <isbns>
        <isbn>
            <name>isbn 3</name>
            <value>6789a</value>
        </isbn>
        <isbn>
            <name>isbn 4</name>
            <value>6789b</value>
        </isbn>
   </isbns>
</item>
</items>

Abdul Basit asked Sep 23 '14 10:09


1 Answer

As far as I can tell from your rather vague question, what you call "Excel" does the following (in my own words): it takes each /items/item element as the source of a row. Going through it in document order, the column names are the tag names of the leaf elements; if a name occurs more than once, the column sits at the position of its first occurrence.

It then creates exactly one row per item, but only if all of the item's child elements are leaf elements. Otherwise, the item serves as the base for several rows, and each child element that itself contains elements is interpolated. E.g. if such a child contains two entries whose leaves have the same names, those are interpolated into two rows. The leaf values are then placed into the columns of the same name, following the logic described in the first paragraph.

How deep this logic goes is not clear from your question, so I keep it at that level only. Otherwise the interpolation would need to recurse deeper into the tree, and the algorithm as outlined might no longer fit.

To build that in PHP, XPath is particularly helpful, and the interpolation works wonders as a Generator:
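The column rule and the leaf lookup described above can be tried out in isolation. The following is a minimal sketch, assuming the SimpleXML extension is available; the abbreviated XML string and the variable names are made up for illustration and only mirror the shape of the question's first sample.

```php
<?php
// Abbreviated, hypothetical sample in the shape of the question's XML.
$xml = simplexml_load_string(
    '<items><item><sku>abc 1</sku>'
    . '<attributes><attribute><name>Author 1</name><value>Rob dude 1</value></attribute></attributes>'
    . '<contributors><contributor>John 1</contributor></contributors>'
    . '<isbn>12345</isbn></item></items>'
);

// Leaf elements (elements without element children) below the first row
// element, in document order. Using the name as array key means a repeated
// name keeps the position of its first occurrence.
$columns = [];
foreach ($xml->xpath('/*/*[1]//*[not(*)]') as $leaf) {
    $columns[$leaf->getName()] = null;
}
$columnNames = array_keys($columns);
print_r($columnNames);

// Leaves of a single entry: the entry itself if it is a leaf,
// otherwise its leaf descendants.
$step   = $xml->item->contributors->contributor[0];
$leaves = array_map('strval', $step->xpath('(.|.//*)[not(*)]'));
print_r($leaves);
```

For this sample the column names come out as sku, name, value, contributor, isbn, and the only leaf of the contributor entry is the entry itself.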

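function tree_to_rows(SimpleXMLElement $xml)
{
    $columns = [];

    // Column names: leaf elements below the first row element, in document
    // order; a repeated name keeps the position of its first occurrence.
    foreach ($xml->xpath('/*/*[1]//*[not(*)]') as $leaf) {
        $columns[$leaf->getName()] = null;
    }

    // The first yielded row is the header.
    yield array_keys($columns);

    // Tag name of the row elements (e.g. "item").
    $name = $xml->xpath('/*/*[1]')[0]->getName();

    foreach ($xml->$name as $source) {
        // Empty row with one slot per column, filled from the direct leaf children.
        $rowModel       = array_combine(array_keys($columns), array_fill(0, count($columns), null));
        $interpolations = [];

        foreach ($source as $child) {
            if ($child->count()) {
                // Child with element children: expanded into extra rows below.
                $interpolations[] = $child;
            } else {
                $rowModel[$child->getName()] = $child;
            }
        }

        // Only leaf children: the item is exactly one row.
        if (!$interpolations) {
            yield array_values($rowModel);
            continue;
        }

        // Otherwise: one row per entry of each non-leaf child, based on the row model.
        foreach ($interpolations as $interpolation) {
            foreach ($interpolation as $interpolationStep) {
                $row = $rowModel;
                // The entry itself if it is a leaf, else its leaf descendants.
                foreach ($interpolationStep->xpath('(.|.//*)[not(*)]') as $leaf) {
                    $row[$leaf->getName()] = $leaf;
                }
                yield array_values($row);
            }
        }
    }
}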

Using it can then be as straightforward as:

$xml  = simplexml_load_file('items.xml');
$rows = tree_to_rows($xml);
echo new TextTable($rows);

This gives the following example output:

+-----+--------+-----+-----------------+----------+-----------+-----+
|sku  |title   |price|name             |value     |contributor|isbn |
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 1|a book 1|42 1 |Number of pages 1|123 1     |           |12345|
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 1|a book 1|42 1 |Author 1         |Rob dude 1|           |12345|
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 1|a book 1|42 1 |                 |          |John 1     |12345|
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 1|a book 1|42 1 |                 |          |Ryan 1     |12345|
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 2|a book 2|42 2 |Number of pages 2|123 2     |           |6789 |
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 2|a book 2|42 2 |Author 2         |Rob dude 2|           |6789 |
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 2|a book 2|42 2 |                 |          |John 2     |6789 |
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 2|a book 2|42 2 |                 |          |Ryan 2     |6789 |
+-----+--------+-----+-----------------+----------+-----------+-----+

TextTable is a slightly modified version of https://gist.github.com/hakre/5734770 that can operate on Generators, in case you're looking for that code.
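If a plain 2-dimensional array is wanted instead of a text table, the yielded rows can be collected and keyed by the header row. The sketch below is only an illustration: rows_to_assoc() is a hypothetical helper that is not part of the code above, and the small generator stands in for tree_to_rows() so the example runs on its own.

```php
<?php
/**
 * Hypothetical helper: turns an iterable whose first item is the header row
 * into a list of associative arrays keyed by column name.
 */
function rows_to_assoc(iterable $rows): array
{
    $header = null;
    $result = [];

    foreach ($rows as $row) {
        if ($header === null) {
            $header = $row;     // the first row holds the column names
            continue;
        }
        // Cast every cell to string so that SimpleXMLElement cells become
        // plain text and empty (null) cells become ''.
        $cells = array_map(function ($cell) {
            return (string) $cell;
        }, $row);
        $result[] = array_combine($header, $cells);
    }

    return $result;
}

// Stand-in for tree_to_rows($xml): header first, then the data rows.
$demo = (function () {
    yield ['sku', 'contributor'];
    yield ['abc 1', null];
    yield ['abc 1', 'John 1'];
})();

$table = rows_to_assoc($demo);
print_r($table);
```

Unlike the array in the question, empty cells are kept as empty strings here rather than dropped, so every row has the same keys.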

hakre answered Nov 01 '22 22:11