Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PHP DOMDocument not formatting output correctly

I'm currently working on the sitemaps for a website, and I'm using SimpleXML to import and do some checks on the original XML file. after this I use simplexml_load_file("small.xml"); to convert it to DOMDocument to make it easier to precisely add and manipulate XML elements. Below is the test XML sitemap that i'm working from:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:52:32-Orouke.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:53:23-castle technology.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:53:38-banana split.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:53:42-Waveney.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:55:12-pure orange.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:57:54-tau press.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:59:21-E.f.m.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:59:31-apple.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:59:45-townhouse communications.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
</urlset>

Now. here is the test code I'm using to modify:

<?php

$root = simplexml_load_file("small.xml");

$domRoot = dom_import_simplexml($root);

$dom = $domRoot->ownerDocument;

$urlElement = $dom->createElement("url");

    $locElement = $dom->createElement("loc");

        $locElement->appendChild($dom->createTextNode("www.google.co.uk"));

    $urlElement->appendChild($locElement);

    $lastmodElement = $dom->createElement("lastmod");

        $lastmodElement->appendChild($dom->createTextNode("2011-08-02"));

    $urlElement->appendChild($lastmodElement);

$domRoot->appendChild($urlElement);

$dom->formatOutput = true;
echo $dom->saveXML();

?>

The main problem is, that no matter where i place $dom->formatOutput = true; the existing XML that was imported from SimpleXML is formatted correctly, but anything new is formatted in the "all one line" style, as follows:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:52:32-Orouke.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:53:23-castle technology.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:53:38-banana split.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:53:42-Waveney.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:55:12-pure orange.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:57:54-tau press.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:59:21-E.f.m.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:59:31-apple.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
  <url>
    <loc>http://www.companycheck.co.uk/searches/2011/08/22/23:59:45-townhouse communications.html</loc>
    <lastmod>2011-08-23</lastmod>
  </url>
<url><loc>www.google.co.uk</loc><lastmod>2011-08-02</lastmod></url></urlset>

If anyone has an idea why this is happening and how to fix it I would be very grateful.

like image 577
Tom Busby Avatar asked Jul 02 '26 03:07

Tom Busby


1 Answers

There is a workaround. You can force reformatting by saving your new xml to string first, then load it again after setting the formatOutput property, e.g.:

$strXml = $dom->saveXML();
$dom->formatOutput = true;
$dom->loadXML($strXml);
echo $dom->saveXML();
like image 77
Simon Avatar answered Jul 03 '26 15:07

Simon



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!