 

Need help installing lxml on os x 10.7

I have been struggling to get from lxml import etree to work (import lxml works fine, by the way). The error is:

ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/lxml/etree.so, 2): Symbol not found: _htmlParseChunk
Referenced from: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/lxml/etree.so
Expected in: flat namespace
in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/lxml/etree.so

I used pip to install lxml, and Homebrew to reinstall libxml2 with the right architecture (or so I think). Does anyone have ideas on how to fix or diagnose this? I'm on 64-bit Python.

Pat B asked Nov 01 '11 01:11



1 Answer

lxml is a bit fussy about what 3rd-party libraries it uses, and it often needs newer versions than those supplied by Apple. I suggest you read and follow the instructions here for building lxml from source on Mac OS X, including building its own statically linked libs. That should work. (I'm a little surprised that Homebrew doesn't already have an lxml recipe.)

UPDATE: Based on the limited information in your comments, it is difficult to be sure exactly what is happening. I suspect you are not using the version of Python you think you are. There are any number of ways to install lxml successfully; that's part of the problem: there are too many options. Rather than trying to debug your setup, here's probably the simplest way to get a working lxml on 10.7 using the Apple-supplied system Python 2.7.

$ sudo STATIC_DEPS=true /usr/bin/easy_install-2.7 lxml

You should then be able to use lxml.etree this way:

$ /usr/bin/python2.7
Python 2.7.1 (r271:86832, Jun 16 2011, 16:59:05) 
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> etree.__file__
'/Library/Python/2.7/site-packages/lxml-2.3.1-py2.7-macosx-10.7-intel.egg/lxml/etree.so'
>>> 

I notice though that the lxml static build process does not produce a working universal build. You'll probably see messages like this during the lxml install:

ld: warning: ignoring file /private/tmp/easy_install-83mJsV/lxml-2.3.1/build/tmp/libxml2/lib/libxslt.a, file was built for archive which is not the architecture being linked (i386)
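
To check which architectures a built file actually contains, lipo -info <file> reports them on OS X. The same information can also be read from the file header. The following is only a minimal sketch and is not part of the lxml build; the function name macho_archs is made up for illustration, and it decodes only the few CPU types listed in its table.

```python
import struct

# CPU type constants from <mach/machine.h>; only a few common ones are listed.
CPU_TYPES = {7: "i386", 0x01000007: "x86_64", 18: "ppc", 0x01000012: "ppc64"}


def macho_archs(data):
    """Return the architecture names found in the leading bytes of a Mach-O file.

    `data` is the start of the file, read in binary mode. An empty list means
    the bytes do not look like a Mach-O file (for example, a thin .a archive).
    """
    if len(data) < 8:
        return []
    magic = struct.unpack(">I", data[:4])[0]
    if magic == 0xCAFEBABE:
        # Universal ("fat") file: a big-endian header with a slice count,
        # followed by one 20-byte record per slice; cputype is the first field.
        count = struct.unpack(">I", data[4:8])[0]
        archs = []
        for i in range(count):
            start = 8 + 20 * i
            if start + 4 > len(data):
                break  # not enough bytes were supplied to read this record
            cputype = struct.unpack(">I", data[start:start + 4])[0]
            archs.append(CPU_TYPES.get(cputype, hex(cputype)))
        return archs
    if magic in (0xFEEDFACE, 0xFEEDFACF):
        fmt = ">I"  # thin file stored big-endian
    elif magic in (0xCEFAEDFE, 0xCFFAEDFE):
        fmt = "<I"  # thin file stored little-endian (the usual case on Intel)
    else:
        return []
    cputype = struct.unpack(fmt, data[4:8])[0]
    return [CPU_TYPES.get(cputype, hex(cputype))]
```

Passing it the first 4096 bytes of a file such as etree.so or one of the built .a libraries shows whether that file has both an i386 and an x86_64 slice. Keep in mind that a file can list an i386 slice and still be missing symbols in it, which is what the linker warning above suggests.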

Assuming the default architecture on your machine is 64-bit, this is what happens if you try to run in 32-bit mode:

$ arch -i386 /usr/bin/python2.7
Python 2.7.1 (r271:86832, Jun 16 2011, 16:59:06) 
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: dlopen(/Library/Python/2.7/site-packages/lxml-2.3.1-py2.7-macosx-10.7-intel.egg/lxml/etree.so, 2): Symbol not found: _htmlParseChunk
  Referenced from: /Library/Python/2.7/site-packages/lxml-2.3.1-py2.7-macosx-10.7-intel.egg/lxml/etree.so
  Expected in: flat namespace
 in /Library/Python/2.7/site-packages/lxml-2.3.1-py2.7-macosx-10.7-intel.egg/lxml/etree.so
>>> ^D

And there is the error message you originally reported! So the root cause appears to be that the static libraries (libxml2, etc.) that lxml builds are not universal. As long as you have no need to use lxml in a 32-bit process (unlikely for most uses), this should not be a problem. Chances are that the Python you were originally using was a 32-bit-only one; that is consistent with some of the other messages you reported.
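
If you want to confirm which interpreter is actually running and whether it is a 32-bit or 64-bit process, a short script like the one below can help. It is a minimal sketch rather than a definitive diagnostic; the helper name interpreter_info is made up for illustration. Run it with the exact python command you normally use.

```python
import struct
import sys


def interpreter_info():
    """Describe the running interpreter: its path, version, and pointer width."""
    bits = struct.calcsize("P") * 8  # size of a C pointer in this process
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        "bits": bits,
    }


info = interpreter_info()
print("%(executable)s: Python %(version)s, %(bits)d-bit" % info)

# Try the import that was failing and report where the module came from.
try:
    from lxml import etree
    print("lxml.etree loaded from %s" % etree.__file__)
except ImportError as exc:
    print("lxml.etree failed to import: %s" % exc)
```

If the reported path is not /usr/bin/python2.7, or the process turns out to be 32-bit, that would fit the explanation above.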

Ned Deily answered Sep 21 '22 18:09
