To learn how to create C extensions, I decided to just copy a built-in .c file (in this case itertoolsmodule.c) and place it in my package. I only changed the names inside the module from itertools to mypkg.
Then I compiled it (Windows 10, MSVC Community 14) as a setuptools.Extension:
from setuptools import setup, Extension
itertools_module = Extension('mypkg.itertoolscopy',
                             sources=['src/itertoolsmodulecopy.c'])
setup(...
      ext_modules=[itertools_module])
By default this uses the compiler flags /c /nologo /Ox /W3 /GL /DNDEBUG /MD, and I read somewhere that these defaults equal the settings with which Python itself was compiled. However, I use conda (64-bit setup), so this might not necessarily be true.
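One way to see what the running interpreter reports about its own build is to ask it directly. The following is only a sketch: the helper name describe_build is made up for illustration, and the values are whatever the distributor recorded, so they may not reflect every optimization that was actually applied.

```python
import platform
import sys
import sysconfig


def describe_build():
    """Collect what the running interpreter reports about its own build."""
    return {
        "version": sys.version,
        # Compiler string embedded in the interpreter, e.g. an MSC or GCC version
        "compiler": platform.python_compiler(),
        "pointer_bits": 64 if sys.maxsize > 2**32 else 32,
        # CFLAGS/OPT are recorded on Unix-like builds; on Windows they are usually None
        "CFLAGS": sysconfig.get_config_var("CFLAGS"),
        "OPT": sysconfig.get_config_var("OPT"),
        # True when itertools is compiled into the interpreter itself
        # rather than loaded as a separate extension module
        "itertools_is_builtin": "itertools" in sys.builtin_module_names,
    }


if __name__ == "__main__":
    for key, value in describe_build().items():
        print("{}: {}".format(key, value))
```

Comparing this output with the flags printed during the extension build gives a rough idea of whether the two were compiled alike.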
It all went well, but a benchmark for filterfalse showed that it's almost a factor of 2 slower than the built-in:
import mypkg
import itertools
import random
a = [random.random() for _ in range(500000)]
func = None
%timeit list(filter(func, a))
100 loops, best of 3: 3.42 ms per loop
%timeit list(itertools.filterfalse(func, a))
100 loops, best of 3: 3.41 ms per loop
%timeit list(mypkg.filterfalse(func, a))
100 loops, best of 3: 6.77 ms per loop
However, for smaller iterables the discrepancy also becomes smaller:
a = [random.random() for _ in range(500)] # 1 / 1000 of the elements
%timeit list(filter(func, a))
100000 loops, best of 3: 9.66 µs per loop
%timeit list(itertools.filterfalse(func, a))
100000 loops, best of 3: 10.8 µs per loop
%timeit list(mypkg.filterfalse(func, a))
100000 loops, best of 3: 14.4 µs per loop
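Outside IPython, roughly the same measurement can be sketched with the standard timeit module (Python 3). The helper names best_time and compare are hypothetical, and the compiled copy is left as a commented-out entry because it only exists once the extension has been built.

```python
import itertools
import random
import timeit


def best_time(make_iter, data, repeat=3, number=10):
    """Return the best per-call time in seconds for list(make_iter(None, data))."""
    timings = timeit.repeat(lambda: list(make_iter(None, data)),
                            repeat=repeat, number=number)
    return min(timings) / number


def compare(candidates, size):
    """Time every candidate on the same random list of the given size."""
    data = [random.random() for _ in range(size)]
    return {name: best_time(fn, data) for name, fn in candidates.items()}


if __name__ == "__main__":
    candidates = {
        "filter": filter,
        "itertools.filterfalse": itertools.filterfalse,
        # "mypkg.filterfalse": mypkg.filterfalse,  # add the compiled copy here
    }
    for size in (500, 500000):
        for name, seconds in compare(candidates, size).items():
            print("{:>22} n={:<7} {:.3e} s".format(name, size, seconds))
```

Taking the minimum over several repeats mirrors the "best of 3" figure that %timeit prints.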
I wasn't able to explain this difference in speed, but I have to admit that I'm not too familiar with compiling C code. I'm at a loss as to what actually makes it slower.
The results are the same on Python 2.7 with ifilter and ifilterfalse and the 2.7 version of the itertoolsmodule.c file.
Does anyone know what makes the code perform worse than the built-ins, and how one could speed it up?
Curious about this problem myself, I set out to reproduce the findings. Though the OP is on Windows, it was slightly easier for me to attempt this on Linux. I did eventually try it on Windows, and I'll walk you through what I did!
I made a little test harness. It's a shell script, but it makes it easier for someone else to try what I'm trying :D
#!/usr/bin/env bash
set -euxo pipefail
rm -rf itertoolsmodule.c setup.py venv
PYTHON=3.5
FUNCTION=filterfalse
INIT=PyInit_
#PYTHON=2.7
#FUNCTION=ifilterfalse
#INIT=init
wget "https://raw.githubusercontent.com/python/cpython/$PYTHON/Modules/itertoolsmodule.c"
sed -i "s/${INIT}itertools/${INIT}_myitertools/" itertoolsmodule.c
sed -i 's/"itertools"/"_myitertools"/' itertoolsmodule.c
cat > setup.py << EOF
from setuptools import setup, Extension
mod = Extension('_myitertools', ['itertoolsmodule.c'])
setup(name='foo', ext_modules=[mod])
EOF
virtualenv venv -ppython"$PYTHON"
venv/bin/pip install . -v
cat > test.py << EOF
import _myitertools
import itertools
import random
import time
a = [random.random() for _ in range(500000)]
iterations = range(10)
seconds = 50
def builtins_filter():
    for _ in iterations:
        list(filter(None, a))
_itertools_filterfalse = itertools.$FUNCTION
def itertools_filterfalse():
    for _ in iterations:
        list(_itertools_filterfalse(None, a))
_myitertools_filterfalse = _myitertools.$FUNCTION
def myitertools_filterfalse():
    for _ in iterations:
        list(_myitertools_filterfalse(None, a))
def runbench(func):
    start = time.time()
    end = start + seconds
    iterations = 0
    while time.time() < end:
        func()
        iterations += 1
    return iterations
for func in (builtins_filter, itertools_filterfalse, myitertools_filterfalse):
    print('*' * 79)
    print(func.__name__)
    print('{} iterations in {} seconds'.format(runbench(func), seconds))
EOF
venv/bin/python test.py
Here is the output (I cut out the parts that are, imo, unimportant):
$ ./test.sh
+ rm -rf itertoolsmodule.c setup.py venv
+ PYTHON=3.5
+ FUNCTION=filterfalse
+ INIT=PyInit_
...
+ venv/bin/pip install . -v
...
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.5m -I/tmp/foo/venv/include/python3.5m -c itertoolsmodule.c -o build/temp.linux-x86_64-3.5/itertoolsmodule.o
creating build/lib.linux-x86_64-3.5
x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.5/itertoolsmodule.o -o build/lib.linux-x86_64-3.5/_myitertools.cpython-35m-x86_64-linux-gnu.so
...
+ venv/bin/python test.py
*******************************************************************************
builtins_filter
1401 iterations in 50 seconds
*******************************************************************************
itertools_filterfalse
1977 iterations in 50 seconds
*******************************************************************************
myitertools_filterfalse
1981 iterations in 50 seconds
+ rm -rf itertoolsmodule.c setup.py venv
+ PYTHON=2.7
+ FUNCTION=ifilterfalse
+ INIT=init
...
+ venv/bin/pip install . -v
...
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/usr/include/python2.7 -c itertoolsmodule.c -o build/temp.linux-x86_64-2.7/itertoolsmodule.o
creating build/lib.linux-x86_64-2.7
x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wl,-Bsymbolic-functions -Wl,-z,relro -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security build/temp.linux-x86_64-2.7/itertoolsmodule.o -o build/lib.linux-x86_64-2.7/_myitertools.so
...
+ venv/bin/python test.py
*******************************************************************************
builtins_filter
871 iterations in 50 seconds
*******************************************************************************
itertools_filterfalse
1918 iterations in 50 seconds
*******************************************************************************
myitertools_filterfalse
1863 iterations in 50 seconds
For Windows, I changed the script slightly so it built virtualenvs using C:\Python##\python.exe (using msysgit, so I have some amount of a Unix toolset: bash, etc.), changing things from bin to Scripts (for virtualenv), etc. I don't have or use conda, so these are just stock Python on Windows 10:
+ rm -rf itertoolsmodule.c setup.py venv
+ PYTHON=2.7
+ FUNCTION=ifilterfalse
+ INIT=init
...
+ venv/Scripts/pip install . -v
...
C:\Users\Anthony\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Bin\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -IC:\Python27\include -Ic:\users\anthony\appdata\local\temp\foo\venv\PC /Tcitertoolsmodule.c /Fobuild\temp.win32-2.7\Release\itertoolsmodule.obj
itertoolsmodule.c
creating build\lib.win32-2.7
C:\Users\Anthony\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Bin\link.exe /DLL /nologo /INCREMENTAL:NO /LIBPATH:C:\Python27\Libs /LIBPATH:c:\users\anthony\appdata\local\temp\foo\venv\libs /LIBPATH:c:\users\anthony\appdata\local\temp\foo\venv\PCbuild /EXPORT:init_myitertools build\temp.win32-2.7\Release\itertoolsmodule.obj /OUT:build\lib.win32-2.7\_myitertools.pyd /IMPLIB:build\temp.win32-2.7\Release\_myitertools.lib /MANIFESTFILE:build\temp.win32-2.7\Release\_myitertools.pyd.manifest
...
+ venv/Scripts/python test.py
*******************************************************************************
builtins_filter
914 iterations in 50 seconds
*******************************************************************************
itertools_filterfalse
2352 iterations in 50 seconds
*******************************************************************************
myitertools_filterfalse
2266 iterations in 50 seconds
+ rm -rf itertoolsmodule.c setup.py venv
+ PYTHON=3.5
+ FUNCTION=filterfalse
+ INIT=PyInit_
...
+ venv/Scripts/pip install . -v
...
D:\Programs\VS2015\VC\BIN\amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Python35\include -IC:\Python35\include -ID:\Programs\VS2015\VC\INCLUDE -ID:\Programs\VS2015\VC\ATLMFC\INCLUDE "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\\winrt" /Tcitertoolsmodule.c /Fobuild\temp.win-amd64-3.5\Release\itertoolsmodule.obj
itertoolsmodule.c
creating C:\Temp\pip-1fnf27jo-build\build\lib.win-amd64-3.5
D:\Programs\VS2015\VC\BIN\amd64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Python35\Libs /LIBPATH:c:\users\anthony\appdata\local\temp\foo\venv\libs /LIBPATH:c:\users\anthony\appdata\local\temp\foo\venv\PCbuild\amd64 /LIBPATH:D:\Programs\VS2015\VC\LIB\amd64 /LIBPATH:D:\Programs\VS2015\VC\ATLMFC\LIB\amd64 "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.10240.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\lib\um\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\8.1\lib\winv6.3\um\x64" /EXPORT:PyInit__myitertools build\temp.win-amd64-3.5\Release\itertoolsmodule.obj /OUT:build\lib.win-amd64-3.5\_myitertools.cp35-win_amd64.pyd /IMPLIB:build\temp.win-amd64-3.5\Release\_myitertools.cp35-win_amd64.lib
...
+ venv/Scripts/python test.py
*******************************************************************************
builtins_filter
658 iterations in 50 seconds
*******************************************************************************
itertools_filterfalse
2601 iterations in 50 seconds
*******************************************************************************
myitertools_filterfalse
2715 iterations in 50 seconds
At the very least, my tests with stock python show that the extension module does not exhibit different performance characteristics.
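To put numbers on that, the iteration counts from the four runs can be turned into a ratio of the copied module's throughput to the built-in's. This is just arithmetic on the figures reported in the logs; the names RESULTS and relative_throughput are made up for the sketch.

```python
# (platform, Python version) -> iterations in 50 seconds, copied from the logs
RESULTS = {
    ("linux", "3.5"): {"itertools": 1977, "copy": 1981},
    ("linux", "2.7"): {"itertools": 1918, "copy": 1863},
    ("windows", "2.7"): {"itertools": 2352, "copy": 2266},
    ("windows", "3.5"): {"itertools": 2601, "copy": 2715},
}


def relative_throughput(results):
    """Return copy/built-in iteration ratios; 1.0 means identical speed."""
    return {
        key: counts["copy"] / float(counts["itertools"])
        for key, counts in results.items()
    }


if __name__ == "__main__":
    for (platform_name, version), ratio in sorted(relative_throughput(RESULTS).items()):
        print("{:8} {}: {:.3f}".format(platform_name, version, ratio))
```

Every ratio lands within 5% of 1.0, which is nowhere near the factor of 2 reported in the question.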
Well, I spent half an hour on this and couldn't reproduce the slowdown. Hopefully this is helpful for the next poor soul who attempts this. I can only guess that conda is doing some additional optimization and then shipping a pyconfig.h file which lies about the flags used to compile. Though to be honest, I haven't yet ventured into the conda space, so I don't know how their ecosystem works.