I am trying to test 3rd party code with Anaconda 4.2 / Python 3.5 When I execute tests I get following exception:
Traceback (most recent call last):
File "pyspark/sql/tests.py", line 25, in <module>
import subprocess
File "/home/user/anaconda3/lib/python3.5/subprocess.py", line 364, in <module>
import signal
File "/home/user/anaconda3/lib/python3.5/signal.py", line 3, in <module>
from functools import wraps as _wraps
File "/home/user/anaconda3/lib/python3.5/functools.py", line 22, in <module>
from types import MappingProxyType
File "/home/user/Spark/spark-2.1.0-bin-hadoop2.7/python/pyspark/sql/types.py", line 22, in <module>
import calendar
File "/home/user/anaconda3/lib/python3.5/calendar.py", line 10, in <module>
import locale as _locale
File "/home/user/anaconda3/lib/python3.5/locale.py", line 108, in <module>
@functools.wraps(_localeconv)
AttributeError: module 'functools' has no attribute 'wraps'
Normally I would assume some module is shadowing built-in modules but as far as I can tell this is not the issue:
functools.__file__) from the tests and it yields expected path. Also there is nothing strange in the path I get in the exception.%run pyspark/sql/tests.py) problem disappears.functools.wraps can be imported in the shell started in the same directory and with the same configuration.virtualenv.With different version of the same project I get:
Traceback (most recent call last):
File "pyspark/sql/tests.py", line 25, in <module>
import pydoc
File "/home/user/anaconda3/lib/python3.5/pydoc.py", line 55, in <module>
import importlib._bootstrap
File "/home/user/anaconda3/lib/python3.5/importlib/__init__.py", line 57, in <module>
import types
File "/home/user/Spark/spark-1.6.3-bin-hadoop2.6/python/pyspark/sql/types.py", line 22, in <module>
import calendar
File "/home/user/anaconda3/lib/python3.5/calendar.py", line 10, in <module>
import locale as _locale
File "/home/user/anaconda3/lib/python3.5/locale.py", line 19, in <module>
import functools
File "/home/user/anaconda3/lib/python3.5/functools.py", line 22, in <module>
from types import MappingProxyType
ImportError: cannot import name 'MappingProxyType'
Is there something obvious I missed here?
Edit:
Dockerfile which can be used to reproduce the problem:
FROM debian:latest
RUN apt-get update
RUN apt-get install -y wget bzip2
RUN wget https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
RUN bash Anaconda3-4.2.0-Linux-x86_64.sh -b -p /anaconda3
RUN wget ftp://ftp.piotrkosoft.net/pub/mirrors/ftp.apache.org/spark/spark-2.1.0/spark-2.1.0-bin-hadoop2.7.tgz
RUN tar xf spark-2.1.0-bin-hadoop2.7.tgz
ENV PATH /anaconda3/bin:$PATH
ENV SPARK_HOME /spark-2.1.0-bin-hadoop2.7
ENV PYTHONPATH $PYTHONPATH:$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$SPARK_HOME/python
WORKDIR /spark-2.1.0-bin-hadoop2.7
RUN python python/pyspark/sql/tests.py
I suspect this is happening because, the functools module of python3 has the following import: from types import MappingProxyType and, instead of picking up this module from ${CONDA_PREFIX}/lib/python3.5/types.py, it tries to import the module from inside the sql directory: ${SPARK_HOME}/python/pyspark/sql/types.py . The functools module of python2 does not have this import and hence does not throw the error.
A workaround to this, is to somehow import the required types module first and then invoke the script. As a proof of concept:
(root) ~/condaexpts$ PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$SPARK_HOME/python python
Python 3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul 2 2016, 17:53:06)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import types
>>> import os
>>> sqltests=os.environ['SPARK_HOME'] + '/python/pyspark/sql/tests.py'
>>> exec(open(sqltests).read())
.....Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/01/30 05:59:43 WARN SparkContext: Support for Java 7 is deprecated as of Spark 2.0.0
17/01/30 05:59:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
...
----------------------------------------------------------------------
Ran 128 tests in 372.565s
Also note that there is nothing special about conda. One can see the same thing in a normal virtualenv (with python3):
~/condaexpts$ virtualenv -p python3 venv
Running virtualenv with interpreter /usr/bin/python3
Using base prefix '/usr'
New python executable in venv/bin/python3
Also creating executable in venv/bin/python
Installing setuptools, pip...done.
~/condaexpts$ source venv/bin/activate
(venv)~/condaexpts$ python --version
Python 3.4.3
(venv)~/condaexpts$ python $WORKDIR/python/pyspark/sql/tests.py
Traceback (most recent call last):
File "/home/ubuntu/condaexpts/spark-2.1.0-bin-hadoop2.7/python/pyspark/sql/tests.py", line 26, in <module>
import pydoc
File "/usr/lib/python3.4/pydoc.py", line 59, in <module>
import importlib._bootstrap
File "/home/ubuntu/condaexpts/venv/lib/python3.4/importlib/__init__.py", line 40, in <module>
import types
File "/home/ubuntu/condaexpts/spark-2.1.0-bin-hadoop2.7/python/pyspark/sql/types.py", line 22, in <module>
import calendar
File "/usr/lib/python3.4/calendar.py", line 10, in <module>
import locale as _locale
File "/home/ubuntu/condaexpts/venv/lib/python3.4/locale.py", line 20, in <module>
import functools
File "/home/ubuntu/condaexpts/venv/lib/python3.4/functools.py", line 22, in <module>
from types import MappingProxyType
ImportError: cannot import name 'MappingProxyType'
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With