When using PHP Simple HTML DOM Parser, is it normal that line breaks are stripped out?
I know this is old, but I was looking for this as well, and realized there is actually a built-in option to turn off the removal of line breaks. No need to go editing the source.
The PHP Simple HTML DOM Parser's load function supports several useful parameters:
load($str, $lowercase=true, $stripRN=true, $defaultBRText=DEFAULT_BR_TEXT)
When calling the load function, simply pass false as the third parameter.
$html = new simple_html_dom();
$html->load("<html><head></head><body>stuff</body></html>", true, false);
If using file_get_html, it's the ninth parameter.
file_get_html($url, $use_include_path = false, $context=null, $offset = -1, $maxLen=-1, $lowercase = true, $forceTagsClosed=true, $target_charset = DEFAULT_TARGET_CHARSET, $stripRN=true, $defaultBRText=DEFAULT_BR_TEXT)
Edit: For str_get_html, it's the fifth parameter (thanks, yitwail).
str_get_html($str, $lowercase=true, $forceTagsClosed=true, $target_charset = DEFAULT_TARGET_CHARSET, $stripRN=true, $defaultBRText=DEFAULT_BR_TEXT, $defaultSpanText=DEFAULT_SPAN_TEXT)
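Because these are positional parameters, the arguments in front of $stripRN have to be spelled out to reach it. A minimal sketch, assuming simple_html_dom.php sits next to the script and that the signature matches the one quoted above; parse_keeping_line_breaks is a hypothetical helper name, not part of the library:

```php
<?php
// Assumption: simple_html_dom.php sits next to this script; adjust the path.
$lib = __DIR__ . '/simple_html_dom.php';
if (is_file($lib)) {
    require_once $lib;
}

// Hypothetical helper (not part of the library): parse a string without
// stripping line breaks.
function parse_keeping_line_breaks($markup)
{
    // Arguments 2-4 repeat the defaults from the signature quoted above,
    // only so that the fifth argument ($stripRN) can be set to false.
    return str_get_html($markup, true, true, DEFAULT_TARGET_CHARSET, false);
}
```

Calling parse_keeping_line_breaks($markup) then returns whatever str_get_html returns for that input, with the original line breaks left in place.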
Was struggling with this as well, since I needed the HTML to be easily editable after processing.
Apparently there's a boolean in the SimpleHTMLDOM script, $stripRN, that's set to true by default. It strips the \r, \n, or \r\n line breaks from the HTML.
Set the variable to false (several occurrences in the script) and your problem is solved.
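To see why this matters for HTML you want to edit afterwards, here is a stand-alone sketch of the effect. It does not use the library at all: strip_line_breaks is a hypothetical stand-in that imitates the behaviour described above by swapping each \r and \n for a space, which is an assumption about how the stripping works rather than the library's actual code.

```php
<?php
// Hypothetical stand-in for the effect of $stripRN = true.
// Not the library's code; it only imitates the behaviour described above.
function strip_line_breaks(string $markup): string
{
    // Replace every carriage return and line feed with a single space.
    return str_replace(["\r", "\n"], ' ', $markup);
}

$markup = "<ul>\r\n<li>one</li>\n<li>two</li>\r</ul>";

$stripped = strip_line_breaks($markup); // roughly what $stripRN = true leaves you with
$kept     = $markup;                    // with $stripRN = false the input is left alone

var_dump(strpos($stripped, "\n") === false); // bool(true): everything is on one line
var_dump(substr_count($kept, "\n"));         // int(2): the line feeds survive
```

With the line breaks gone, the saved markup comes back as one long line, which is what makes it awkward to edit by hand.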