
BeautifulSoup won't recognize lxml

I'm trying to use lxml as the parser for BeautifulSoup because the default one is much slower, but I'm getting this error:

    soup = BeautifulSoup(html, "lxml")
  File "/home/rob/python/stock/local/lib/python2.7/site-packages/bs4/__init__.py", line 152, in __init__
    % ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

I have uninstalled and reinstalled both lxml and BeautifulSoup many times, but bs4 still cannot find the parser. I've also tried reinstalling lxml's dependencies, and I'm still getting this error.

I even created a new virtual environment and installed everything fresh, and I still get this error.

Does anyone have any idea what's going on here?

Edits

I'm using the latest versions of bs4 and lxml with Python 2.7.x on Ubuntu desktop.

I can import lxml, but "from lxml import etree" fails with:

  File "<stdin>", line 1, in <module>
ImportError: /usr/lib/x86_64-linux-gnu/libxml2.so.2: version `LIBXML2_2.9.0' not found (required by /home/rob/python/stock/local/lib/python2.7/site-packages/lxml/etree.so)

I have libxml2 installed, but I'm not sure which version. I have installed and reinstalled the latest one, and I also tried to install 2.9.0 manually, with no change.
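To narrow down which import actually fails, a small check like the one below can help. It is only a sketch: the helper names check_import and diagnose are made up for illustration, and it relies on nothing beyond the standard library's importlib.

```python
import importlib


def check_import(module_name):
    """Try to import module_name and return (ok, message)."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # Raised both when the package is missing and when a compiled
        # extension cannot load the shared library it was built against.
        return False, str(exc)
    return True, "ok"


def diagnose(names=("lxml", "lxml.etree", "bs4")):
    """Check each module in order and collect the results."""
    return [(name, check_import(name)) for name in names]


if __name__ == "__main__":
    for name, (ok, message) in diagnose():
        print("%-12s %s" % (name, "OK" if ok else "FAILED: " + message))
```

If lxml passes but lxml.etree fails, as described above, the package itself is installed and the problem lies in the compiled extension or the libxml2 it loads at runtime.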

robz228 asked Jan 24 '14 01:01


2 Answers

It looks like lxml has not been successfully installed. To install lxml on Ubuntu, run

    sudo apt-get install libxslt1-dev libxml2

Then, inside the virtualenv:

    pip install --upgrade lxml
    pip install cssselect
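After reinstalling, it may be worth checking that the compiled part of lxml really loads before handing "lxml" to BeautifulSoup. The sketch below is one way to do that. The function name pick_parser is hypothetical, and falling back to "html.parser" (the parser from Python's standard library) is an assumption about what you would want when lxml is unavailable.

```python
import importlib


def pick_parser():
    """Return "lxml" if lxml.etree can be loaded, otherwise "html.parser"."""
    try:
        # bs4 can only offer its lxml tree builder when this import succeeds
        importlib.import_module("lxml.etree")
    except ImportError:
        return "html.parser"
    return "lxml"


if __name__ == "__main__":
    print("Parser to use: " + pick_parser())
```

The returned string can then be passed as the second argument, for example BeautifulSoup(html, pick_parser()), so the script keeps working (more slowly) while the lxml installation is being sorted out.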
unutbu answered Oct 16 '22 15:10


Go to these pages:

  1. https://pypi.python.org/pypi/cssselect

  2. https://pypi.python.org/pypi/lxml/3.2.5

Download the source archives for both packages and extract each one into its own folder. Then, in each folder, locate the setup.py file and run the following command:

    python setup.py install

You may run into some problems when building lxml. If you get an error like

    error: command 'gcc' failed with exit status 1

make sure libxml2-dev and libxslt1-dev are installed:

    sudo apt-get install libxml2-dev libxslt1-dev

Hopefully that should work.
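Once the build goes through, one way to confirm that the extension and the system library agree is to compare the libxml2 version lxml was compiled against with the one it loads at runtime. This is a rough sketch: the function name libxml2_versions is made up for illustration, and it returns None if lxml.etree still cannot be imported.

```python
def libxml2_versions():
    """Return lxml and libxml2 version tuples, or None if lxml.etree fails to load."""
    try:
        from lxml import etree
    except ImportError:
        return None
    return {
        "lxml": etree.LXML_VERSION,
        # libxml2 version lxml was built against
        "libxml2_compiled": etree.LIBXML_COMPILED_VERSION,
        # libxml2 version actually loaded at runtime
        "libxml2_runtime": etree.LIBXML_VERSION,
    }


if __name__ == "__main__":
    versions = libxml2_versions()
    if versions is None:
        print("lxml.etree could not be imported")
    else:
        for key in sorted(versions):
            print("%s: %s" % (key, ".".join(str(part) for part in versions[key])))
```

The LIBXML2_2.9.0 error in the question suggests an extension built against a newer libxml2 than the one installed on the system, so after a successful rebuild the runtime version should not be older than the compiled one.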

user1801060 answered Oct 16 '22 16:10