Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I turn <br> and <p> into line breaks?

Let's say I have an HTML with <p> and <br> tags inside. Aftewards, I'm going to strip the HTML to clean up the tags. How can I turn them into line breaks?

I'm using Python's BeautifulSoup library, if that helps at all.

like image 660
TIMEX Avatar asked May 08 '12 01:05

TIMEX


People also ask

How do you type a line break?

To add spacing between lines or paragraphs of text in a cell, use a keyboard shortcut to add a new line. Click the location where you want to break the line. Press ALT+ENTER to insert the line break.

How do you put a line break in P?

To use the <br> tag, simply add the <br> tag wherever you want a line break to appear in your code. In the example above, the <br> tag is used to insert a line break between the two lines of text, all within the same <p> element. This is what the text will look like: This is a paragraph.

How do you line break without br?

A line break can be added to HTML elements without having to utilize a break return <br> by using pseudo-elements. Pseudo-elements are used to style a specific part of an element. Here we will use ::after to style an HTML element to add a line break.

What is the tag code for line break?

The <br> tag inserts a single line break. The <br> tag is useful for writing addresses or poems.


1 Answers

Without some specifics, it's hard to be sure this does exactly what you want, but this should give you the idea... it assumes your b tags are wrapped inside p elements.

from BeautifulSoup import BeautifulSoup
import six

def replace_with_newlines(element):
    text = ''
    for elem in element.recursiveChildGenerator():
        if isinstance(elem, six.string_types):
            text += elem.strip()
        elif elem.name == 'br':
            text += '\n'
    return text

page = """<html>
<body>
<p>America,<br>
Now is the<br>time for all good men to come to the aid<br>of their country.</p>
<p>pile on taxpayer debt<br></p>
<p>Now is the<br>time for all good men to come to the aid<br>of their country.</p>
</body>
</html>
"""

soup = BeautifulSoup(page)
lines = soup.find("body")
for line in lines.findAll('p'):
    line = replace_with_newlines(line)
    print line

Running this results in...

(py26_default)[mpenning@Bucksnort ~]$ python thing.py
America,
Now is the
time for all good men to come to the aid
of their country.
pile on taxpayer debt

Now is the
time for all good men to come to the aid
of their country.
(py26_default)[mpenning@Bucksnort ~]$
like image 55
Mike Pennington Avatar answered Sep 16 '22 19:09

Mike Pennington