
Preserve Line Breaks - Simple HTML DOM Parser

Tags:

php

domparser

When using PHP Simple HTML DOM Parser, is it normal that line-break
tags are stripped out?

Tim asked Jan 27 '11 04:01


2 Answers

I know this is old, but I was looking for this as well, and realized there is actually a built-in option to turn off the removal of line breaks. No need to go editing the source.

The PHP Simple HTML DOM Parser's load function accepts several optional parameters:

load($str, $lowercase=true, $stripRN=true, $defaultBRText=DEFAULT_BR_TEXT)

When calling the load function, simply pass false as the third parameter.

$html = new simple_html_dom();
$html->load("<html><head></head><body>stuff</body></html>", true, false);

If using file_get_html, it's the ninth parameter.

file_get_html($url, $use_include_path = false, $context=null, $offset = -1, $maxLen=-1, $lowercase = true, $forceTagsClosed=true, $target_charset = DEFAULT_TARGET_CHARSET, $stripRN=true, $defaultBRText=DEFAULT_BR_TEXT)

Edit: For str_get_html, it's the fifth parameter (Thanks yitwail)

str_get_html($str, $lowercase=true, $forceTagsClosed=true, $target_charset = DEFAULT_TARGET_CHARSET, $stripRN=true, $defaultBRText=DEFAULT_BR_TEXT, $defaultSpanText=DEFAULT_SPAN_TEXT)
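
If counting positional parameters feels error-prone, a small wrapper can pin $stripRN to false for you. This is only a sketch: str_get_html_keep_breaks is a made-up name, not part of the library, and it assumes simple_html_dom.php has already been included and that your copy has the str_get_html signature shown above.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): parses a string while
// keeping \r and \n by passing false as the fifth argument of str_get_html().
function str_get_html_keep_breaks(string $markup)
{
    // Assumption: the caller has already included simple_html_dom.php.
    if (!function_exists('str_get_html')) {
        throw new RuntimeException('Include simple_html_dom.php first.');
    }

    return str_get_html(
        $markup,
        true,                   // $lowercase (default from the signature above)
        true,                   // $forceTagsClosed (default from the signature above)
        DEFAULT_TARGET_CHARSET, // $target_charset (constant defined by the library)
        false                   // $stripRN: keep line breaks
    );
}
```

Once the library is loaded, call str_get_html_keep_breaks($markup) wherever you would have called str_get_html($markup). The first four arguments simply repeat the defaults, so only the line-break handling changes.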
Steve answered Oct 20 '22 04:10


Was struggling with this as well, since I needed the HTML to be easily editable after processing.

Apparently there's a boolean in the SimpleHTMLDOM script, $stripRN, that's set to true by default. It strips the \r, \n, or \r\n line breaks from the HTML.

Set the var to false (several occurrences in the script) and your problem is solved.
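
To make the effect concrete, here is a rough stand-in for what the flag does. It assumes the library swaps each \r and \n for a single space, which is worth checking against your copy of the script. strip_rn_sketch is a hypothetical name, not a library function.

```php
<?php
// Hypothetical stand-in, not part of Simple HTML DOM: mimics the effect of
// $stripRN, assuming each \r and \n is replaced by a single space.
function strip_rn_sketch(string $markup, bool $stripRN = true): string
{
    if (!$stripRN) {
        return $markup; // line breaks are left untouched
    }

    return str_replace(["\r", "\n"], ' ', $markup);
}

$markup = "<p>one</p>\r\n<p>two</p>\n";

// With stripping on, no \r or \n survives; with it off, the input is unchanged.
var_dump(strip_rn_sketch($markup));
var_dump(strip_rn_sketch($markup, false) === $markup);
```

With the flag on, the markup collapses onto one line, which is why the parsed HTML is hard to edit by hand afterwards.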

tomhermans answered Oct 20 '22 04:10