Remove white spaces between tag values in xml with php

I have been searching for a way to remove the white space that my PHP code leaves between elements when I export to XML. In detail: I load an XML file, search it with XPath, remove the elements that do not match certain brands, and finally export the result as a new XML file. The problem is that the new XML is full of white space left behind by the code. I tried trimming it, but that doesn't seem to work correctly.

Here is my code:

<?php
$sXML = simplexml_load_file('file.xml'); // First, load the XML
$brands = $sXML->xpath('//brand'); // Find every <brand> element

function filter(string $input) { // Returns false for the listed brands, true for any other
    switch ($input) {
        case 'BRAND 3':
        case 'BRAND 4':
            return false;
        default:
            return true;
    }
}

array_walk($brands, function($brand) { // Remove the parent <item> of each brand for which filter() returns true
    $content = (string) $brand;
    if (filter($content)) {
        $item = $brand->xpath('..')[0];
        unset($item[0]);
    }
});

$sXML->asXML('filtred.xml'); // Finally, export a new XML file

?>

This is the original XML:

<?xml version="1.0" encoding="utf-8"?>
<products>
  <item>
    <reference>00001</reference>
    <other_string>PRODUCT 1</other_string>
    <brand>BRAND 1</brand>
  </item>
  <item>
    <reference>00002</reference>
    <other_string>PRODUCT 2</other_string>
    <brand>BRAND 2</brand>
  </item>
  <item>
    <reference>00003</reference>
    <other_string>PRODUCT 3</other_string>
    <brand>BRAND 3</brand>
  </item>
  <item>
    <reference>00004</reference>
    <other_string>PRODUCT 4</other_string>
    <brand>BRAND 4</brand>
  </item>
  <item>
    <reference>00005</reference>
    <other_string>PRODUCT 5</other_string>
    <brand>BRAND 5</brand>
  </item>
</products>

And this is the output of the script:

<?xml version="1.0" encoding="utf-8"?>
<products>
  <item>
    <reference>00001</reference>
    <other_string>PRODUCT 1</other_string>
    <brand>BRAND 1</brand>
  </item>
  <item>
    <reference>00002</reference>
    <other_string>PRODUCT 2</other_string>
    <brand>BRAND 2</brand>
  </item>


  <item>
    <reference>00005</reference>
    <other_string>PRODUCT 5</other_string>
    <brand>BRAND 5</brand>
  </item>
</products>

As you can see, the output has blank lines between product 2 and product 5, and I need to remove them. Any help will be appreciated.

Fernando Olvera asked Mar 20 '19 03:03


2 Answers

You can make SimpleXML discard whitespace-only text nodes (the indentation and line breaks between elements) when it reads the file, by passing the LIBXML_NOBLANKS option to simplexml_load_file:

$sXML = simplexml_load_file('file.xml', null, LIBXML_NOBLANKS);

Then when you call ->asXML(), all the whitespace will be removed, and you'll get XML all on one line, like this:

<?xml version="1.0" encoding="utf-8"?>
<products><item><reference>00003</reference><other_string>PRODUCT 3</other_string><brand>BRAND 3</brand></item><item><reference>00004</reference><other_string>PRODUCT 4</other_string><brand>BRAND 4</brand></item></products>
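To see what the option changes, here is a minimal sketch that counts the child nodes of the root element with and without LIBXML_NOBLANKS. The inline XML string and the variable names are made up for illustration; they are not taken from the question's file.

```php
<?php
// Illustrative input (an assumption, not the question's file): two indented items.
$xml = "<products>\n  <item><brand>BRAND 1</brand></item>\n  <item><brand>BRAND 2</brand></item>\n</products>";

// Default parse: the indentation between elements is kept as whitespace text nodes.
$plain = simplexml_load_string($xml);
$defaultCount = dom_import_simplexml($plain)->childNodes->length;     // text, item, text, item, text

// With LIBXML_NOBLANKS the whitespace-only text nodes are dropped while parsing.
$stripped = simplexml_load_string($xml, null, LIBXML_NOBLANKS);
$noBlanksCount = dom_import_simplexml($stripped)->childNodes->length; // item, item

echo "default: $defaultCount, LIBXML_NOBLANKS: $noBlanksCount", PHP_EOL;
```

Those leftover text nodes are what show up as blank lines once the elements between them have been removed.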

To re-generate whitespace based on the remaining structure, you'll need to use DOM rather than SimpleXML - but that's easy to do without changing any of your existing code, because dom_import_simplexml simply "rewraps" the XML without reparsing it.

Then you can use the DOMDocument formatOutput property and save() method to "pretty-print" the document:

$sXML = simplexml_load_file('file.xml', null, LIBXML_NOBLANKS);
// ...
// process $sXML as before
// ...
$domDocument = dom_import_simplexml($sXML)->ownerDocument;
$domDocument->formatOutput = true;
$domDocument->save('filtered.xml'); // returns the number of bytes written, or false on failure
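
For reference, the steps above can be put together in one self-contained sketch. It assumes the goal is to drop the BRAND 3 and BRAND 4 items, as in the output shown in the question; the inline sample data, the $excluded list and the use of saveXML() in place of save() are choices made here so the result can be inspected without touching the file system.

```php
<?php
// Inline sample standing in for file.xml (assumption: same shape as the question's data).
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<products>
  <item>
    <reference>00001</reference>
    <brand>BRAND 1</brand>
  </item>
  <item>
    <reference>00003</reference>
    <brand>BRAND 3</brand>
  </item>
  <item>
    <reference>00005</reference>
    <brand>BRAND 5</brand>
  </item>
</products>
XML;

// LIBXML_NOBLANKS drops the whitespace-only text nodes while parsing.
$sXML = simplexml_load_string($xml, null, LIBXML_NOBLANKS);

// Hypothetical exclusion list: every <item> whose <brand> is listed here is removed.
$excluded = ['BRAND 3', 'BRAND 4'];
foreach ($sXML->xpath('//brand') as $brand) {
    if (in_array((string) $brand, $excluded, true)) {
        $item = $brand->xpath('..')[0]; // parent <item> of this <brand>
        unset($item[0]);                // detach the <item> from the document
    }
}

// Re-wrap the same document as DOM and let it re-indent the output.
$domDocument = dom_import_simplexml($sXML)->ownerDocument;
$domDocument->formatOutput = true;
$pretty = $domDocument->saveXML();
echo $pretty;
```

Because the whitespace nodes were never loaded, the indentation is generated from the remaining structure, so no gap is left where the removed item used to be.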

IMSoP answered Sep 22 '22 14:09


Another possibility is to use preg_replace:

// Get simpleXml as string
$xmlAsString = $yourSimpleXmlObject->asXML();

// Remove newlines
$xmlAsString = preg_replace("/\n/", "", $xmlAsString);

// Remove spaces between tags
$xmlAsString = preg_replace("/>\s*</", "><", $xmlAsString);

var_dump($xmlAsString);

Now you have your XML as a one-line string (including the XML declaration).
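
As a small illustration of the second replacement, the sketch below applies it to a short string. The helper name collapseBetweenTags and the sample input are hypothetical. Note that the pattern works on plain text and knows nothing about XML, so it would also rewrite a matching sequence inside a CDATA section or a comment.

```php
<?php
// Hypothetical helper wrapping the "spaces between tags" regex step.
function collapseBetweenTags(string $xml): string
{
    // \s matches spaces, tabs and newlines, so any whitespace-only gap between
    // a ">" and the next "<" is removed; text inside elements is left alone.
    return preg_replace('/>\s+</', '><', $xml);
}

// Illustrative input with a blank gap like the one in the question's output.
$input = "<products>\n  <item>\n    <brand>BRAND 1</brand>\n  </item>\n\n\n</products>";
$compact = collapseBetweenTags($input);
echo $compact, PHP_EOL; // <products><item><brand>BRAND 1</brand></item></products>
```

Since \s already covers line breaks between tags, this step on its own closes the gaps; the separate newline removal additionally strips line breaks that occur inside text content.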

Mike answered Sep 23 '22 14:09