PHP invalid character error

Tags: dom, php, xml

I'm getting this error when running this code:

Fatal error: Uncaught exception 'DOMException' with message 'Invalid Character Error' in test.php:29
Stack trace:
#0 test.php(29): DOMDocument->createElement('1OhmStable', 'a')
#1 {main}
  thrown in test.php on line 29

The names taken from the original XML file do contain invalid characters, but since I strip those characters before creating the nodes, the nodes should be created. What kind of encoding do I need to apply to the original XML document? Do I need to decode the output of saveXML?

function __cleanData($c) 
{
    return preg_replace("/[^A-Za-z0-9]/", "",$c);
}
$xml = new DOMDocument('1.0', 'UTF-8');
$xml->load('test.xml');    
$xml->formatOutput = true; 

$append = array();
foreach ($xml->getElementsByTagName('product') as $product)
{
    foreach ($product->getElementsByTagName('name') as $name)
    {

        $append[] = $name;
    }
    foreach ($append as $a)
    {
        $nodeName = __cleanData($a->textContent);

        $element = $xml->createElement(htmlentities($nodeName), 'a');
    }
    $product->removeChild($xml->getElementsByTagName('details')->item(0));
    $product->appendChild($element);
}

$result = $xml->saveXML();
$file = "data.xml";
file_put_contents($file,$result);

This is what the original XML looks like:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="/v1/xsl/xml_pretty_printer.xsl" type="text/xsl"?>
<products>
<product>
<modelNumber>M100</modelNumber>
<itemId>1553725</itemId>
<details>
  <detail>
    <name>1 Ohm Stable</name>
    <value>600 x 1</value>
  </detail>
 </details>
</product>
 </products>

The new document is supposed to look like this:

 <?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="/v1/xsl/xml_pretty_printer.xsl" type="text/xsl"?>
<products>
<product>
<modelNumber>M100</modelNumber>
<itemId>1553725</itemId>
  <1 Ohm Stable>

  </1 Ohm Stable>

  </product>
 </products>

Ryan asked Dec 15 '11 17:12


3 Answers

Simply put, an element name cannot start with a number:

1OhmStable  <-- rename this
_1OhmStable <-- this is fine

See also the related question "php parse xml - error: StartTag: invalid element name".

A nice article on the topic: http://www.xml.com/pub/a/2001/07/25/namingparts.html

A Name is a token beginning with a letter or one of a few punctuation characters, and continuing with letters, digits, hyphens, underscores, colons, or full stops, together known as name characters.
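
The naming rule above can be applied before calling createElement(). The following is only a minimal sketch: the helper name toXmlName() is hypothetical (it is not part of the question's code), and prefixing an underscore is one possible convention among many. It handles ASCII letters and digits only, as __cleanData() in the question does.

```php
<?php
// Hypothetical helper, not part of the original code.
function toXmlName(string $raw): string
{
    // Keep ASCII letters and digits only, as __cleanData() in the question does.
    $name = preg_replace('/[^A-Za-z0-9]/', '', $raw);

    // A name may not be empty or start with a digit, so prefix an underscore.
    if ($name === '' || preg_match('/^[0-9]/', $name)) {
        $name = '_' . $name;
    }

    return $name;
}

$doc = new DOMDocument('1.0', 'UTF-8');
$element = $doc->createElement(toXmlName('1 Ohm Stable'), 'a');
$doc->appendChild($element);

echo $doc->saveXML($element), "\n"; // <_1OhmStable>a</_1OhmStable>
```

Names that already start with a letter pass through unchanged, so only the problematic cases are renamed.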

ajreal answered Oct 13 '22 16:10


You have not said where exactly you get that error. If it happens after you have cleaned the value, this is my guess:

preg_replace("/[^A-Za-z0-9]/", "",$c);

This replacement is not written for UTF-8 encoded strings (which is what DOMDocument uses). You can make it UTF-8 aware with the u modifier (PCRE_UTF8):

preg_replace("/[^A-Za-z0-9]/u", "",$c);
                            ^

It's just a guess. I suggest you state more precisely in your question which part of your code triggers the error.
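
To see what the u modifier changes for this particular pattern, here is a small check. The input strings are illustrative values, not data from the question. For valid UTF-8 input both variants appear to yield the same ASCII-only result, so a leading digit survives either way; the visible difference is that with the modifier an invalid UTF-8 subject makes preg_replace() return null.

```php
<?php
// "St\xC3\xA4ble" is "Stäble" written as explicit UTF-8 bytes (illustrative value).
$name = "1 Ohm St\xC3\xA4ble";

$withoutU = preg_replace('/[^A-Za-z0-9]/', '', $name);  // works byte by byte
$withU    = preg_replace('/[^A-Za-z0-9]/u', '', $name); // works on code points

var_dump($withoutU === $withU); // bool(true)
var_dump($withU);               // string(9) "1OhmStble"

// "\xE4" alone is not valid UTF-8; with /u the call fails and returns null.
$broken = preg_replace('/[^A-Za-z0-9]/u', '', "1 Ohm St\xE4ble");
var_dump($broken);              // NULL
```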

hakre answered Oct 13 '22 17:10


Even if __cleanData() removes every character other than the Latin letters a-z and digits, that does not guarantee that the result is a valid XML name. Your function can return strings that begin with a digit, but digits are not legal name start characters in XML; they can only appear after the first character of a name. Spaces are also forbidden in names, so your expected XML output would fail on that point as well.
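
One way to check a candidate name before building the document is to let DOMDocument itself judge it. This is a sketch only: isValidElementName() is a hypothetical helper, and it assumes the default strict error checking, under which createElement() throws a DOMException for an invalid name (as in the stack trace in the question).

```php
<?php
// Hypothetical helper: reports whether DOMDocument accepts $name as an element name.
function isValidElementName(string $name): bool
{
    $doc = new DOMDocument();
    try {
        $doc->createElement($name);
        return true;
    } catch (DOMException $e) {
        // Thrown with message 'Invalid Character Error' for illegal names.
        return false;
    }
}

var_dump(isValidElementName('1OhmStable'));   // bool(false): starts with a digit
var_dump(isValidElementName('1 Ohm Stable')); // bool(false): digit first, and spaces
var_dump(isValidElementName('_1OhmStable'));  // bool(true)
```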

jasso answered Oct 13 '22 16:10