Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What's the difference between scipy.special.binom and scipy.misc.comb?

Tags:

python

scipy

What is the difference between scipy.special.binom and scipy.misc.comb?

In ipython I can see they return different types and also have different accuracy.

scipy.special.binom(4,3)
4.0

scipy.misc.comb(4,3)
array(4.000000000000001)

However what exactly are they doing differently?


Looking at https://github.com/scipy/scipy/blob/master/scipy/special/generate_ufuncs.py , scipy.special.binom says

binom -- binom: dd->d                                      -- orthogonal_eval.pxd

scipy.misc.comb calls scipy.special.gammaln whose line in generate_ufuncs.py says

gammaln -- lgam: d->d, clngamma_wrap: D->D                 -- cephes.h, specfun_wrappers.h
like image 398
graffe Avatar asked Feb 03 '14 19:02

graffe


1 Answers

Whenever you come across a situation where you do not know what some code is doing and it is not straightforwardly easy to simply import the parent module in some interpreter and inspect the execution of the code, the docstrings, etc., then you have a few options.

Let me describe two options that did not turn out to be very helpful this time, but which are a great way to start on a question like this in the future (so that you can include output from these attempts to prove to folks that you tried some stuff before posting here, and to look fancy for knowing what these things are):

You can use the dis module to disassemble the Python code into executed op-codes, like so:

python -c "import dis; from scipy import misc; print dis.dis('misc.comb(4,3)')"
python -c "import dis; from scipy import special; print dis.dis('special.binom(4,3)')"

If using a *nix OS, you can also (almost always) use strace to launch something and inspect the system calls that are made. In this case you could look at the output of

strace -qc python -c "from scipy import special; special.binom(4,3)"

versus

strace -qc python -c "from scipy import misc; special.comb(4,3)"

(the -qc option makes the output less chatty and aggregates the time spent in different system calls, which can be easier to read as a first start. If you omit the -qc part, you'll get a big screen dump of all the system calls... something you'll want to open in Emacs and search through, or pipe to a system tool for searching).

In this case, strace doesn't help too much and there's a lot of noisy system calls related to just the imports of the modules.

What worked for me this time is cProfile:

python -c "import cProfile; cProfile.run('from scipy import special; special.binom(4,3)')"

In this case, I was able to see that the first execution boils down to a system call into scipy.special.orthogonal_eval and Googling this module revealed that we are talking about a shared library, built as the file orthogonal_eval.so and I found this nice page with the source code.

You can see the full function definition of binom there, which includes a standard calculation of the factorials involved in the formula for small values, and some approximation with other special functions (I see some calls to "Gamma", "beta" and "lbeta" defined in some file called cephes.h).

Looks pretty standard -- if I needed to, I can go dig around that .h file and Google some more and probably find a long-standing C library of special functions at the bottom of all of that.

Meanwhile, for misc.comb first consider that the regular documentation of this is available (whereas, for me, it is not available for binom).

The docstring says there is a possible 3rd argument, exact which can be set to something other than 0 if you don't want a float to be returned. In that case, a long is returned, but you could cast to int if you'd like.

This explains the precision question. You can read in orthogonal_eval how if k is relatively small, and the answer will give an integer (integer arguments), then something with less round off error is used. Whereas comb will only do this if you say exact=1 (or anything else such that exact != 0).

As for the code that comb is executing, well, we can view the source that's linked from the SciPy documentation page.

This function is also making use of some helper function calls from scipy.special but the mix between what is called in the Python layer vs. the C layer is different, and the implementation for the approximation part is slightly different as well.

I'm not sure what level of specificity you need, but I would say that for most purposes, this level of detail should suffice: binom is implemented directly in the C extension layer in orthogonal_eval and makes some tweaks to reduce precision issues for small inputs. misc implements the small input stuff in Python directly and makes use of special calls without passing through binom itself -- so there was some amount of reinventing the wheel for whoever programmed these. Since they mix and match function calls between the Python layer and C layer, it's not surprising that there would be some precision differences.

like image 120
ely Avatar answered Sep 27 '22 23:09

ely