
Tidy breaks anchor hrefs

I have the following HTML (I can't change it; it comes from an external source):

<a href="http://linkhref.com"><center>Link Text</center></a>

After processing it with Tidy, the HTML is broken: the anchor is no longer clickable because its inner HTML is moved to after the tag. I am using the following configuration options:

$config = array(
    'output-xml' => true,
    'wrap' => false,
    'doctype' => 'omit',
    'quote-nbsp' => false,
    'quiet' => true,
    'bare' => true,
    'fix-backslash' => false,
    'indent-cdata' => false
);

Tidy outputs:

<html> 
<head> 
<title></title> 
</head> 
<body> 
<a href="http://linkhref.com"></a> 
<center>Link Text</center> 
<br /> 
</body> 
</html>

Any suggestions? Thanks a lot.

Pavel Perna asked Mar 27 '26 16:03


2 Answers

Ideally you don't want the "center" tag to remain anyway: it is deprecated in HTML 4 and obsolete in HTML5.

Try adding the configuration option:

'clean' => true
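
For context, here is a minimal sketch of the question's configuration with that option added. The `$html` variable, the `'utf8'` encoding argument, and the `extension_loaded` guard are assumptions for illustration; whether `clean` alone keeps the link content inside the anchor may depend on the Tidy version, so check the output.

```php
<?php
// Markup from the question.
$html = '<a href="http://linkhref.com"><center>Link Text</center></a>';

// The question's options, plus the suggested 'clean' option.
$config = array(
    'output-xml' => true,
    'wrap' => false,
    'doctype' => 'omit',
    'quote-nbsp' => false,
    'quiet' => true,
    'bare' => true,
    'fix-backslash' => false,
    'indent-cdata' => false,
    'clean' => true,
);

// Assumption: the tidy extension may not be installed, so guard the calls.
if (extension_loaded('tidy')) {
    $tidy = tidy_parse_string($html, $config, 'utf8');
    tidy_clean_repair($tidy);
    echo tidy_get_output($tidy), "\n";
} else {
    echo "tidy extension not available\n";
}
```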
Rob Baillie answered Mar 30 '26 07:03


I had a similar issue with A tags, which HTML5 now allows to contain block-level elements.

I solved it like this:

static function tidy_links_cb($m)
{
    // Block-level elements that Tidy will not keep inside an inline <a>.
    $blocks = 'address|article|aside|blockquote|dir|div|dl|fieldset|footer|form|h1|h2|h3|h4|h5|h6|header|hr|li|menu|nav|ol|p|pre|section|table|td|th|tr|ul';
    // $m[1] is everything between "<a" and "</a>": attributes, ">" and inner HTML.
    if (preg_match('~<('.$blocks.')\b~is', $m[1])) {
        // This link contains a block element: rename it to a placeholder tag.
        return '<alink'.$m[1].'</alink>';
    }
    return $m[0];
}

static function Tidy($html)
{
    $config = array(
        'wrap' => 0,
        'show-body-only' => true,
        'enclose-text' => true,
        'output-xhtml' => true,
        'doctype' => 'omit',
        'bare' => true,
        'char-encoding' => 'raw',
        'input-encoding' => 'raw',
        'output-encoding' => 'raw',
        'quiet' => true,
        'hide-comments' => true,
        'new-blocklevel-tags' => 'section alink',
        'new-inline-tags' => 'button',
        'drop-empty-elements' => false,
    );

    // You cannot simply replace all <a> tags because Tidy would mess up the inline ones.
    // \b keeps the pattern from matching other tags that start with "a", such as <abbr>.
    $html = preg_replace_callback('~<a\b(.*)</a>~isU', array(__CLASS__, 'tidy_links_cb'), $html);

    $tidy = tidy_parse_string($html, $config);
    tidy_clean_repair($tidy);

    // Convert the tidy object to a string before restoring the placeholder tags.
    $html = (string) $tidy;
    $html = str_replace('<alink ', '<a ', $html);
    $html = str_replace('alink>', 'a>', $html);

    return $html;
}
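
The placeholder swap can be tried on its own, without Tidy, to see what it does to the markup from the question. This is only a sketch: the function names `protect_block_links` and `restore_block_links` are hypothetical, the block list is shortened, and `center` is added to it as an assumption because the question's link wraps a `<center>` element.

```php
<?php
// Hypothetical helper: rename <a> to <alink> when the link wraps a block element.
function protect_block_links(string $html): string
{
    // Shortened list for illustration; 'center' is added for the question's markup.
    $blocks = 'center|div|p|ul|ol|table|h[1-6]|section';

    return preg_replace_callback(
        '~<a\b(.*)</a>~isU',
        function (array $m) use ($blocks): string {
            // $m[1] holds the attributes, the closing ">" and the inner HTML.
            if (preg_match('~<(' . $blocks . ')\b~i', $m[1])) {
                return '<alink' . $m[1] . '</alink>';
            }
            return $m[0]; // inline-only links are left untouched
        },
        $html
    );
}

// Hypothetical helper: turn the placeholder tags back into real anchors.
function restore_block_links(string $html): string
{
    return str_replace(
        array('<alink ', '<alink>', '</alink>'),
        array('<a ', '<a>', '</a>'),
        $html
    );
}

$input = '<a href="http://linkhref.com"><center>Link Text</center></a> <a href="#x">inline</a>';

$protected = protect_block_links($input);
echo $protected, "\n";
// <alink href="http://linkhref.com"><center>Link Text</center></alink> <a href="#x">inline</a>

echo restore_block_links($protected), "\n"; // prints the original $input again
```

In the full function above, the Tidy call sits between these two steps, with `alink` declared in `new-blocklevel-tags` so that Tidy accepts block content inside it.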
Martin Zvarík answered Mar 30 '26 06:03


