I'm trying to fix an XML file in which thousands of lines trigger this error:
Opening and ending tag mismatch
I currently parse the file with SimpleXML, so I need to repair the XML before handing it to that library.
This is what I'm trying at the moment, but it isn't enough:
    libxml_use_internal_errors(true);
    $xml = @simplexml_load_file($temp_name);
    $errors = libxml_get_errors();
    foreach ($errors as $error) {
        if (strpos($error->message, 'Opening and ending tag mismatch') !== false) {
            $tag = trim(preg_replace('/Opening and ending tag mismatch: (.*) line.*/', '$1', $error->message));
            $lines = file($temp_name, FILE_IGNORE_NEW_LINES);
            $line = $error->line + 1;
            echo $line;
            echo "<br>";
            $lines[$line] = '</' . $tag . '>' . $lines[$line];
            file_put_contents($temp_name, implode("\n", $lines));
        }
    }
Any ideas?
First, if you've got corrupt data then fixing the program that generated it is usually more important than repairing the data.
If the only errors in the file are mismatched end tags, then presumably the repair strategy is to ignore what's in the end tag entirely, given that the name appearing in an XML end tag is redundant. You might find that an existing tool such as TagSoup or validator.nu handles this the way you want; or you might find that such a tool outputs XML which can be transformed into the form you want. That's a better prospect than writing your own parser for this non-XML grammar.
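If no existing tool fits, the "ignore the name in the end tag" strategy can be sketched in a few lines of PHP. This is only an illustration under stated assumptions, not a robust parser: the function name `repairEndTags` is hypothetical, the file is assumed to be well-formed apart from wrong names in end tags (no missing end tags), and a DOCTYPE with an internal subset is not handled. Every end tag is rewritten to close whichever element is currently innermost.

```php
<?php
// Hypothetical helper: rewrite each end tag so it closes the innermost open
// element, ignoring whatever name the end tag actually carries.
function repairEndTags(string $xml): string
{
    $stack = []; // names of the elements that are currently open

    // Alternatives, in order: comment, CDATA section, processing instruction,
    // other <!...> declaration (all copied through unchanged), then a tag.
    $pattern = '~<!--.*?-->|<!\[CDATA\[.*?\]\]>|<\?.*?\?>|<![^>]*>'
             . '|<(/?)([^\s/>!?][^\s/>]*)((?:"[^"]*"|\'[^\']*\'|[^>"\'])*)>~s';

    $out = preg_replace_callback($pattern, function (array $m) use (&$stack) {
        if (!isset($m[2])) {
            return $m[0]; // not an element tag, leave as-is
        }
        if ($m[1] === '/') {
            // End tag: emit the name of the innermost open element instead.
            $open = array_pop($stack);
            return $open === null ? '' : "</$open>"; // drop stray end tags
        }
        // Start tag: remember it unless it is self-closing (<name ... />).
        if (substr(rtrim($m[3]), -1) !== '/') {
            $stack[] = $m[2];
        }
        return $m[0];
    }, $xml);

    if ($out === null) {
        throw new RuntimeException('Regex processing failed');
    }

    // Close anything still open at the end of the input.
    while (($open = array_pop($stack)) !== null) {
        $out .= "</$open>";
    }
    return $out;
}
```

The repaired string could then be passed to `simplexml_load_string()` in place of loading the file directly, for example `simplexml_load_string(repairEndTags(file_get_contents($temp_name)))`. If end tags are missing altogether rather than misnamed, this sketch will pair the wrong tags, so check the result against a sample of the data first.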