I have a Python extension which uses CPU-specific features, if available. This is done through a run-time check. If the hardware supports the POPCNT instruction then it selects one implementation of my inner loop; if SSSE3 is available then it selects another; otherwise it falls back to generic versions of my performance-critical kernel. (Some 95%+ of the time is spent in this kernel.)
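The selection order described above can be sketched roughly as follows. This is only a Python illustration: the real check lives in the extension's C code, the function names and kernel labels are hypothetical, and reading the flags from `/proc/cpuinfo`-style text is an assumption that only holds on Linux x86.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the feature flags found in /proc/cpuinfo-style text (assumed Linux x86 layout)."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def select_kernel(flags):
    """Pick a kernel label in the order from the question: POPCNT, then SSSE3, then generic."""
    if "popcnt" in flags:
        return "popcnt"
    if "ssse3" in flags:
        return "ssse3"
    return "generic"


# Example with a made-up CPU that has SSSE3 but no POPCNT.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 pni ssse3\n"
print(select_kernel(parse_cpu_flags(sample)))  # prints: ssse3
```

The point of the check is that code built with CPU-specific flags is only reached after the matching feature has been detected, which is why it matters that nothing outside the selected kernel is compiled with those flags.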
Unfortunately, there's a failure mode I didn't expect. I use -mssse3 and -O3 to compile all of the C code, even though only one file needs the -mssse3 option. As a result, the other files are compiled with the expectation that SSSE3 will exist. This causes a segfault for the line:
start_target_popcount = (int)(query_popcount * threshold);
because the compiler used fisttpl, which is an SSE3 instruction (-mssse3 enables SSE3 as well as SSSE3). After all, I told it to assume that SSSE3 exists.
The Debian packager for my package recently ran into this problem, because the test machine has a GCC which understands -mssse3 and generates code with that in mind, but the machine itself has an older CPU without those instructions.
I want a solution where the same binary works on both older and newer machines, and which the Debian maintainer can use for that distro.
Ideally, I would like to say that only one file is compiled with the -mssse3 option. Since my CPU-specific selector code isn't part of this file, no SSSE3 code will ever be executed unless the CPU supports it.
However, I can't figure out any way to tell distutils that a set of compiler options is specific to a single file. Is that even possible?
A very ugly solution would be to create two (or more) Extension objects, one to hold the SSSE3 code and the other for everything else. You could then tidy the interface up in the Python layer.
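from distutils.core import Extension

# my_files is the full list of C sources; everything except the SSSE3 file is built without -mssse3
c_src = [f for f in my_files if f != 'ssse3_file.c']
c_gen = Extension('c_general', sources=c_src,
                  libraries=[], extra_compile_args=['-O3'])
# only this extension is compiled with -mssse3
c_ssse3 = Extension('c_ssse_three', sources=['ssse3_file.c'],
                    libraries=[], extra_compile_args=['-O3', '-mssse3'])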
and in an __init__.py somewhere:
from c_general import *
from c_ssse_three import *
Of course you don't need me to write out that code! And I know this isn't DRY; I look forward to reading a better answer!
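One possible way to keep everything in a single extension is to wrap the compiler's per-file compile step so that only chosen sources get extra flags. The sketch below is not a documented distutils feature: it assumes the compiler object exposes a private `_compile(obj, src, ext, cc_args, extra_postargs, pp_opts)` method, as the Unix compiler class in distutils does, and other compiler classes may not route through it. The helper names `add_per_file_flags` and `make_build_ext` are hypothetical; `ssse3_file.c` is the file name used above.

```python
import os

# Extra flags keyed by source file name (file name taken from the example above).
PER_FILE_FLAGS = {"ssse3_file.c": ["-mssse3"]}


def add_per_file_flags(compile_func, per_file_flags):
    """Wrap a _compile-style callable so that selected sources get extra flags."""
    def wrapper(obj, src, ext, cc_args, extra_postargs, pp_opts):
        extra = per_file_flags.get(os.path.basename(src), [])
        # Build a new list so the caller's extra_postargs is never mutated.
        return compile_func(obj, src, ext, cc_args,
                            list(extra_postargs) + extra, pp_opts)
    return wrapper


def make_build_ext(base_build_ext, per_file_flags=PER_FILE_FLAGS):
    """Return a build_ext subclass that installs the wrapper before building."""
    class build_ext_per_file(base_build_ext):
        def build_extensions(self):
            # ASSUMPTION: self.compiler has a private _compile method.
            self.compiler._compile = add_per_file_flags(
                self.compiler._compile, per_file_flags)
            base_build_ext.build_extensions(self)
    return build_ext_per_file
```

In a setup.py this would be hooked in by importing `build_ext` from `distutils.command.build_ext` and passing `cmdclass={'build_ext': make_build_ext(build_ext)}` to `setup()`, with `-mssse3` left out of the extension's `extra_compile_args`. Because it depends on a private method, it may break with other compilers or future releases.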