Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

DOMDocument: Ignore Duplicate Element IDs

I'm putting some page content (which has been run through Tidy, but doesn't need to be if this is a source of problems) into DOMDocument using DOMDocument::loadHTML.

It's coming up with various errors:

ID x already defined in Entity, line X

Is there any way to make either DOMDocument (or Tidy) ignore or strip out duplicate element IDs, so it will actually create the DOMDocument?

Thanks. :)

like image 695
James Inman Avatar asked Jan 06 '09 09:01

James Inman


1 Answers

A quick search on the subject reveals this (incorrect) bug report:

http://bugs.php.net/bug.php?id=46136

The last reply states the following:

You're using HTML 4 rules to load an XHTML document. Either use the load() method to parse as XML or the libxml_use_internal_errors() function to ignore the warnings.

I can't be sure if you are encountering this problem for the same reasons, since you did not include a reference to the HTML page being loaded. In any case, using libxml_use_internal_errors() should at least suppress the error.

ID's in HTML documents are generally unique, so the best solution would still be validating your document, if at all possible.

like image 87
Aron Rotteveel Avatar answered Sep 28 '22 08:09

Aron Rotteveel