bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

Tags: python, beautifulsoup, python-2.7, lxml

...
    soup = BeautifulSoup(html, "lxml")
  File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 152, in __init__
    % ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

The above is printed in my Terminal. I am on Mac OS 10.7.x with Python 2.7.1. I followed this tutorial to get Beautiful Soup and lxml; both installed successfully and work with a separate test file located here. The Python script that causes this error includes this line:

from pageCrawler import comparePages

And the pageCrawler file includes the following two lines:

from bs4 import BeautifulSoup
from urllib2 import urlopen

Any help in figuring out what the problem is and how it can be solved would be much appreciated.

442

asked Jun 25 '14 00:06

user3773048

2 Answers

I suspect this is related to the parser that BS uses to read the HTML. The parsers are documented here, but if you're like me (on OS X) you might be stuck with something that requires a bit of work:

You'll notice that in the BS4 documentation page above, they point out that by default BS4 will use the Python built-in HTML parser. Assuming you are on OS X, the Apple-bundled version of Python is 2.7.2, which is not lenient about character formatting. I hit this same problem, so I upgraded my version of Python to work around it. Doing this in a virtualenv will minimize disruption to other projects.

If doing that sounds like a pain, you can switch over to the lxml parser:

pip install lxml

And then try:

soup = BeautifulSoup(html, "lxml")

Depending on your scenario, that might be good enough. I found this annoying enough to warrant upgrading my version of Python. Using virtualenv, you can migrate your packages fairly easily.
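If the error persists after installing lxml, one possible explanation (an assumption; the question does not confirm it) is that pip installed lxml for a different Python than the one running the failing script. A minimal sketch for checking this, using only the standard library; the helper name importable is hypothetical:

```python
import importlib
import sys


def importable(names=("bs4", "lxml")):
    """Map each module name to True if this interpreter can import it."""
    result = {}
    for name in names:
        try:
            importlib.import_module(name)
            result[name] = True
        except ImportError:
            result[name] = False
    return result


# Run this with the same command that runs the failing script,
# so that the same interpreter and site-packages are inspected.
print("Interpreter: %s" % sys.executable)
print("Version: %s" % sys.version.split()[0])
for name, ok in sorted(importable().items()):
    print("%s: %s" % (name, "importable" if ok else "NOT importable"))
```

If lxml shows up as not importable here but imports fine in the separate test file, the two are probably being run by different interpreters.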

100

answered Sep 23 '22 01:09

James Errico

I'd prefer the built-in Python HTML parser: no install, no dependencies.

soup = BeautifulSoup(s, "html.parser")
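If you want lxml when it is available but don't want the script to fail without it, the two approaches can be combined. This is only a sketch: pick_parser, make_soup and PARSER_MODULES are hypothetical names, and it assumes that a parser is usable whenever its backing module imports.

```python
import importlib

# Maps a Beautiful Soup parser name to the module that must be importable
# for that parser to work. "html.parser" ships with Python itself.
PARSER_MODULES = {
    "lxml": "lxml.etree",
    "html5lib": "html5lib",
    "html.parser": None,
}


def pick_parser(preferred=("lxml", "html.parser")):
    """Return the first name in `preferred` whose backing module imports."""
    for name in preferred:
        if name not in PARSER_MODULES:
            continue  # unknown parser name, skip it
        module = PARSER_MODULES[name]
        if module is None:
            return name  # built in, always available
        try:
            importlib.import_module(module)
            return name
        except ImportError:
            continue
    return "html.parser"  # last resort: the built-in parser


def make_soup(html):
    """Parse `html` with the best available parser (requires bs4)."""
    from bs4 import BeautifulSoup  # lazy import so pick_parser works alone
    return BeautifulSoup(html, pick_parser())


print("Parser chosen: %s" % pick_parser())
```

Note that different parsers can build slightly different trees from malformed HTML, so a silent fallback may change results between machines.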

answered Sep 25 '22 01:09

Ernst
