Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PHP convert XML to JSON

Tags:

json

php

xml

People also ask

How to convert XML string to JSON in PHP?

php class XmlToJson { public function Parse ($url) { $fileContents= file_get_contents($url); $fileContents = str_replace(array("\n", "\r", "\t"), '', $fileContents); $fileContents = trim(str_replace('"', "'", $fileContents)); $simpleXml = simplexml_load_string($fileContents); $json = json_encode($simpleXml); return $ ...

Can you convert XML to JSON?

To convert an XML document to JSON, follow these steps: Select the XML to JSON action from the Tools > JSON Tools menu. Choose or enter the Input URL of the XML document. Choose the path of the Output file that will contain the resulting JSON document.


Json & Array from XML in 3 lines:

$xml = simplexml_load_string($xml_string);
$json = json_encode($xml);
$array = json_decode($json,TRUE);

Sorry for answering an old post, but this article outlines an approach that is relatively short, concise and easy to maintain. I tested it myself and works pretty well.

http://lostechies.com/seanbiefeld/2011/10/21/simple-xml-to-json-with-php/

<?php   
class XmlToJson {
    public function Parse ($url) {
        $fileContents= file_get_contents($url);
        $fileContents = str_replace(array("\n", "\r", "\t"), '', $fileContents);
        $fileContents = trim(str_replace('"', "'", $fileContents));
        $simpleXml = simplexml_load_string($fileContents);
        $json = json_encode($simpleXml);

        return $json;
    }
}
?>

I figured it out. json_encode handles objects differently than strings. I cast the object to a string and it works now.

foreach($xml->children() as $state)
{
    $states[]= array('state' => (string)$state->name); 
}       
echo json_encode($states);

I guess I'm a bit late to the party but I have written a small function to accomplish this task. It also takes care of attributes, text content and even if multiple nodes with the same node-name are siblings.

Dislaimer: I'm not a PHP native, so please bear with simple mistakes.

function xml2js($xmlnode) {
    $root = (func_num_args() > 1 ? false : true);
    $jsnode = array();

    if (!$root) {
        if (count($xmlnode->attributes()) > 0){
            $jsnode["$"] = array();
            foreach($xmlnode->attributes() as $key => $value)
                $jsnode["$"][$key] = (string)$value;
        }

        $textcontent = trim((string)$xmlnode);
        if (count($textcontent) > 0)
            $jsnode["_"] = $textcontent;

        foreach ($xmlnode->children() as $childxmlnode) {
            $childname = $childxmlnode->getName();
            if (!array_key_exists($childname, $jsnode))
                $jsnode[$childname] = array();
            array_push($jsnode[$childname], xml2js($childxmlnode, true));
        }
        return $jsnode;
    } else {
        $nodename = $xmlnode->getName();
        $jsnode[$nodename] = array();
        array_push($jsnode[$nodename], xml2js($xmlnode, true));
        return json_encode($jsnode);
    }
}   

Usage example:

$xml = simplexml_load_file("myfile.xml");
echo xml2js($xml);

Example Input (myfile.xml):

<family name="Johnson">
    <child name="John" age="5">
        <toy status="old">Trooper</toy>
        <toy status="old">Ultrablock</toy>
        <toy status="new">Bike</toy>
    </child>
</family>

Example output:

{"family":[{"$":{"name":"Johnson"},"child":[{"$":{"name":"John","age":"5"},"toy":[{"$":{"status":"old"},"_":"Trooper"},{"$":{"status":"old"},"_":"Ultrablock"},{"$":{"status":"new"},"_":"Bike"}]}]}]}

Pretty printed:

{
    "family" : [{
            "$" : {
                "name" : "Johnson"
            },
            "child" : [{
                    "$" : {
                        "name" : "John",
                        "age" : "5"
                    },
                    "toy" : [{
                            "$" : {
                                "status" : "old"
                            },
                            "_" : "Trooper"
                        }, {
                            "$" : {
                                "status" : "old"
                            },
                            "_" : "Ultrablock"
                        }, {
                            "$" : {
                                "status" : "new"
                            },
                            "_" : "Bike"
                        }
                    ]
                }
            ]
        }
    ]
}

Quirks to keep in mind: Several tags with the same tagname can be siblings. Other solutions will most likely drop all but the last sibling. To avoid this each and every single node, even if it only has one child, is an array which hold an object for each instance of the tagname. (See multiple "" elements in example)

Even the root element, of which only one should exist in a valid XML document is stored as array with an object of the instance, just to have a consistent data structure.

To be able to distinguish between XML node content and XML attributes each objects attributes are stored in the "$" and the content in the "_" child.

Edit: I forgot to show the output for your example input data

{
    "states" : [{
            "state" : [{
                    "$" : {
                        "id" : "AL"
                    },
                    "name" : [{
                            "_" : "Alabama"
                        }
                    ]
                }, {
                    "$" : {
                        "id" : "AK"
                    },
                    "name" : [{
                            "_" : "Alaska"
                        }
                    ]
                }
            ]
        }
    ]
}

A common pitfall is to forget that json_encode() does not respect elements with a textvalue and attribute(s). It will choose one of those, meaning dataloss. The function below solves that problem. If one decides to go for the json_encode/decode way, the following function is advised.

function json_prepare_xml($domNode) {
  foreach($domNode->childNodes as $node) {
    if($node->hasChildNodes()) {
      json_prepare_xml($node);
    } else {
      if($domNode->hasAttributes() && strlen($domNode->nodeValue)){
         $domNode->setAttribute("nodeValue", $node->textContent);
         $node->nodeValue = "";
      }
    }
  }
}

$dom = new DOMDocument();
$dom->loadXML( file_get_contents($xmlfile) );
json_prepare_xml($dom);
$sxml = simplexml_load_string( $dom->saveXML() );
$json = json_decode( json_encode( $sxml ) );

by doing so, <foo bar="3">Lorem</foo> will not end up as {"foo":"Lorem"} in your JSON.


Try to use this

$xml = ... // Xml file data

// first approach
$Json = json_encode(simplexml_load_string($xml));

---------------- OR -----------------------

// second approach
$Json = json_encode(simplexml_load_string($xml, "SimpleXMLElement", LIBXML_NOCDATA));

echo $Json;

Or

You can use this library : https://github.com/rentpost/xml2array


This solution handles namespaces, attributes, and produces consistent result with repeating elements (always in array, even if there is only one occurrence). Inspired by ratfactor's sxiToArray().

/**
 * <root><a>5</a><b>6</b><b>8</b></root> -> {"root":[{"a":["5"],"b":["6","8"]}]}
 * <root a="5"><b>6</b><b>8</b></root> -> {"root":[{"a":"5","b":["6","8"]}]}
 * <root xmlns:wsp="http://schemas.xmlsoap.org/ws/2004/09/policy"><a>123</a><wsp:b>456</wsp:b></root> 
 *   -> {"root":[{"xmlns:wsp":"http://schemas.xmlsoap.org/ws/2004/09/policy","a":["123"],"wsp:b":["456"]}]}
 */
function domNodesToArray(array $tags, \DOMXPath $xpath)
{
    $tagNameToArr = [];
    foreach ($tags as $tag) {
        $tagData = [];
        $attrs = $tag->attributes ? iterator_to_array($tag->attributes) : [];
        $subTags = $tag->childNodes ? iterator_to_array($tag->childNodes) : [];
        foreach ($xpath->query('namespace::*', $tag) as $nsNode) {
            // the only way to get xmlns:*, see https://stackoverflow.com/a/2470433/2750743
            if ($tag->hasAttribute($nsNode->nodeName)) {
                $attrs[] = $nsNode;
            }
        }

        foreach ($attrs as $attr) {
            $tagData[$attr->nodeName] = $attr->nodeValue;
        }
        if (count($subTags) === 1 && $subTags[0] instanceof \DOMText) {
            $text = $subTags[0]->nodeValue;
        } elseif (count($subTags) === 0) {
            $text = '';
        } else {
            // ignore whitespace (and any other text if any) between nodes
            $isNotDomText = function($node){return !($node instanceof \DOMText);};
            $realNodes = array_filter($subTags, $isNotDomText);
            $subTagNameToArr = domNodesToArray($realNodes, $xpath);
            $tagData = array_merge($tagData, $subTagNameToArr);
            $text = null;
        }
        if (!is_null($text)) {
            if ($attrs) {
                if ($text) {
                    $tagData['_'] = $text;
                }
            } else {
                $tagData = $text;
            }
        }
        $keyName = $tag->nodeName;
        $tagNameToArr[$keyName][] = $tagData;
    }
    return $tagNameToArr;
}

function xmlToArr(string $xml)
{
    $doc = new \DOMDocument();
    $doc->loadXML($xml);
    $xpath = new \DOMXPath($doc);
    $tags = $doc->childNodes ? iterator_to_array($doc->childNodes) : [];
    return domNodesToArray($tags, $xpath);
}

Example:

php > print(json_encode(xmlToArr('<root a="5"><b>6</b></root>')));
{"root":[{"a":"5","b":["6"]}]}