I'm having a problem with HTML Purifier where it removes IDs on headline elements despite using configuration options to avoid such behavior.
Right now I'm using:
// set up HTML Purifier for user inputs
require_once 'htmlpurifier/library/HTMLPurifier.auto.php';
$config = HTMLPurifier_Config::createDefault();
$config->set('Core.Encoding', 'UTF-8');
$config->set('HTML.Doctype', 'HTML 4.01 Transitional');
$config->set('Attr.EnableID', true);
$config->set('HTML.Trusted', true);
$purifier = new HTMLPurifier($config);
I then feed it a string like:
<h6 id="1843804297">This is a title</h6><h5 id="1979691494">This one too.</h5><h3 id="932393874">I think you see where this is going.</h3>
I have also tried creating whitelisted entries for headlines with IDs, to no avail, and even directly manipulating the defaults stored in the $config object:
$config->def->defaults['Attr.EnableID'] = true;
The IDs are important because they are assigned by a PHP script, stored in MySQL, and later picked up by a JS navigation system. They need to be fed in from the user, because often they stay static for subsequent content updates.
I believe that's because IDs that begin with a digit are invalid in HTML 4. The HTML 4.01 specification says:
ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
Try using IDs that begin with a letter, or change the Doctype.
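As a rough illustration of the "different IDs" route, here is a minimal sketch that checks an ID against the HTML 4 rule quoted above and prefixes it with a letter when it fails. The function names and the "h" prefix are hypothetical and not part of HTML Purifier, and the sketch only handles the leading-character problem (it does not repair IDs that contain other illegal characters such as spaces).

```php
<?php
// Hypothetical helpers for illustration; neither is an HTML Purifier API.

// Returns true if $id matches the HTML 4 ID/NAME token rule:
// a letter first, then letters, digits, "-", "_", ":" or ".".
function is_valid_html4_id(string $id): bool
{
    // The D modifier stops "$" from matching before a trailing newline.
    return preg_match('/^[A-Za-z][A-Za-z0-9_:.-]*$/D', $id) === 1;
}

// Returns $id unchanged if it is already valid, otherwise prepends $prefix.
// The default prefix "h" is an arbitrary assumption.
function make_html4_id(string $id, string $prefix = 'h'): string
{
    return is_valid_html4_id($id) ? $id : $prefix . $id;
}

echo make_html4_id('1843804297'), "\n"; // prints h1843804297
echo make_html4_id('intro'), "\n";      // prints intro
```

If you go this way, the script that generates the IDs and the JS navigation would both need to use the prefixed form, so that the value stored in MySQL still matches what ends up in the markup.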