Using BeautifulSoup to parse lines separated by <br> tags?

Tags:

python

parsing

beautifulsoup

I have a page that looks like this:

Company A<br />
123 Main St.<br />
Suite 101<br />
Someplace, NY 1234<br />
<br />
<br />
<br />
Company B<br />
456 Main St.<br />
Someplace, NY 1234<br />
<br />
<br />
<br />

Sometimes there are two rather than three "br" tags separating the entries. How would I use BeautifulSoup to parse through this document and extract the fields? I'm stumped because the bits of text that I need are not contained in paragraph (or similar) tags that I can simply iterate through.


asked Feb 21 '10 07:02

jamieb

1 Answer

You should look into the .strings attribute found on tags (it yields every piece of text inside the tag), then use "\n".join() on that.
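Joining .strings gives one flat block of text, so grouping the lines into entries needs one more step. Here is a minimal sketch of one way to do it, assuming the bs4 package and the built-in "html.parser" backend. The function name split_entries and the rule "two or more consecutive <br> tags end an entry" are illustrative assumptions, not part of the original answer.

```python
from bs4 import BeautifulSoup, NavigableString, Tag

html = """Company A<br />
123 Main St.<br />
Suite 101<br />
Someplace, NY 1234<br />
<br />
<br />
<br />
Company B<br />
456 Main St.<br />
Someplace, NY 1234<br />
<br />
<br />
"""


def split_entries(markup):
    """Group text lines into entries separated by runs of <br> tags.

    split_entries is a hypothetical helper: a single <br> ends a line,
    and two or more <br> tags in a row end the current entry.
    """
    soup = BeautifulSoup(markup, "html.parser")
    entries, current, br_run = [], [], 0
    for node in soup.descendants:
        if isinstance(node, Tag) and node.name == "br":
            br_run += 1
            if br_run >= 2 and current:
                entries.append(current)
                current = []
        elif isinstance(node, NavigableString):
            text = node.strip()
            if text:  # ignore whitespace-only strings between tags
                current.append(text)
                br_run = 0
    if current:  # keep a final entry that has no trailing separator
        entries.append(current)
    return entries


for entry in split_entries(html):
    print(entry)

# The flat-text variant from the answer, with whitespace stripped:
soup = BeautifulSoup(html, "html.parser")
flat_text = "\n".join(s.strip() for s in soup.strings if s.strip())
```

Because the sketch counts <br> tags rather than blank strings, it should behave the same whether the entries are separated by two or three tags, and whether or not there are newlines between them.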

answered Sep 23 '22 03:09

ychaouche
