I'm doing some probability calculations.
In one of my tasks, I need to multiply the number of combinations for choosing 8000 samples from 10000 items by 0.8**8000.
The number of combinations is a very large integer (a Python long), and with the help of numpy I get 0.8**8000 as 5.2468172239242176864e-776.
But when I try to multiply these two numbers, I get: [9] 34845 segmentation fault ipython -i
How can I do this multiplication?
PS: This is a piece of my code:
import numpy
d2 = numpy.float128(0.8) ** 8000
d1 = 165555575235503558460892983752748337696863078099010763950122624527927836980322780662408249953188062227721112100054260160204180655980717428736444016909193193353770953722788106404786520413339850951599929567643032803416164290936680088121145665954509987077953596641237451927908536624592636591471456488142060812180933761408708169972797751139799352908109763166895772281109195968567911923343187466596002627570139321755043803267091330804414889831229832744256038117150720178689066894068507531026417815624234453195871008113238128934831837842040515600131726096039123279876153916504647241693083829553081901075278042326502699324012014817969085443550523855284341221708045253558716789811929298590803855947461554713178815399150688529048306222786951038548880400191620565711291586700534540755526276938422405001345270278335726581375322976014611332999126216550500951669985289322635729053541565465940744524663726205818866513444952048185208697438054246674199211750006230637806394882672053335493831407089830994135058867370833787098758113596190447219426121568324685764151601296948654893782399960327514764114467176417125060133454019708700782282480571935020898204763471121684913190735908414301826140125010936910161942130277906874552721346626800201093026689035996876035329180150478191582393837824731994055511844267891121846403164857127885959745644323971338513739214928092232132691519007718752719466750891748327404893783451436251805894736392433617289459646429204124129760273396235033220480921175386059331059354409267348067375581516003852060360378571075522650956157791058846993826792047806030332676423336065499519953076910418838626376480202828151673161942289092221049283902410699951912366163469099917310239336454637062482599733606299329923589714875696509548029668358723465427602758225427644633549944802010973352599970041918971524450218727345622721744933664742499521140235707102217164259438766026322532351208348119475549696983427008567651685921355966036780080415723688044325099562693124488758728102729947753752228785786200998
322978801432511608341549234067324280214361346940194251357867820535466891356019219904248859277399657389914429390105240751239760865282709465029549690591863591028864648910033430400L
print d1 * d2
When multiplying an extremely large number by an extremely small number, working with floats can introduce huge inaccuracies. In your case, the magnitude of the numbers is causing overflow errors, so you have bigger problems than just inaccuracies!
Whenever you find yourself in this situation, it can be useful to first check if it is possible to stay in the integer domain, and "massage" the numbers a little first. In your case, it is possible and I'll explain how below.
One operand of the multiplication, the extremely large number, is the number of ways to choose 8000 samples from 10000 items. Use the closed-form expression for the number of combinations, where the total number of items n is 10000 and the subset size r is 8000. The exclamation mark (!) here denotes factorial, which is available as math.factorial in Python.
C(n,r) = n! / (r! * (n - r)!)
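As a minimal sketch of the closed form, the function below computes C(n, r) with exact integer arithmetic. The name n_choose_r is only illustrative, and the comparison against math.comb assumes Python 3.8 or newer.

```python
from math import comb, factorial

def n_choose_r(n, r):
    """Number of ways to choose r items from n, as an exact integer."""
    # Floor division is exact here because r! * (n - r)! always divides n!.
    return factorial(n) // (factorial(r) * factorial(n - r))

big = n_choose_r(10000, 8000)
print(big == comb(10000, 8000))  # the standard-library helper should agree
print(big.bit_length())          # size of the result in bits
```

Python integers have arbitrary precision, so this part never overflows; the trouble only starts once a float enters the calculation.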
The other operand, 0.8 ** 8000, is the extremely small number, which by the laws of exponents is equal to:
8**8000 / 10**8000
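A quick way to see why this rewriting helps, assuming plain Python floats (64-bit doubles) rather than numpy.float128: the float power silently underflows to zero, while the ratio of two integers can be kept exact with fractions.Fraction.

```python
from fractions import Fraction

# With 64-bit floats the true value (about 5e-776) is below the smallest
# representable positive number, so the result underflows to 0.0.
as_float = 0.8 ** 8000
print(as_float)

# The same quantity as an exact ratio of integers.
as_fraction = Fraction(8, 10) ** 8000
print(as_fraction == Fraction(8 ** 8000, 10 ** 8000))
```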
So when we multiply these two numbers together, the answer we want is:
10000! * 8**8000
--------------------------
8000! * 2000! * 10**8000
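Because every term in this fraction is an integer, it can also be evaluated exactly, without logarithms. The following is a sketch of that idea; exact_product and its parameter names are hypothetical helpers, not part of any library.

```python
from fractions import Fraction
from math import factorial

def exact_product(n, r, p_num, p_den):
    """Return C(n, r) * (p_num / p_den)**r as an exact Fraction."""
    numerator = factorial(n) * p_num ** r
    denominator = factorial(r) * factorial(n - r) * p_den ** r
    return Fraction(numerator, denominator)

x = exact_product(10000, 8000, 8, 10)

# x is far too large for float(x), so inspect its integer part instead.
integer_part = x.numerator // x.denominator
print(len(str(integer_part)))  # number of decimal digits before the point
```

This is slower than the log-domain approach but loses no precision, which may matter if the result feeds into further exact calculations.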
Let's call this number x and take logarithms of both sides. Working in the log domain turns multiplications into additions and divisions into subtractions, which keeps the numbers manageable.
from math import log, factorial
numerator = log(factorial(10000)) + 8000*log(8)
denominator = log(factorial(8000)) + log(factorial(2000)) + 8000*log(10)
log_x = numerator - denominator
Now these numbers are of a magnitude that is usable in Python.
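You will find that log_x is approximately 3214. Since exp(log_x) == x, the answer is about e**3214, or roughly 10**1396. It is a very large, but finite, number. Note that it is still too large for an ordinary float, so math.exp(log_x) raises OverflowError; either keep the result as a logarithm or use exact integer arithmetic.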
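One way to report the result without ever forming x as a float is to convert log_x to base 10 and split it into a mantissa and an exponent. The sketch below makes two assumptions of its own: it uses math.lgamma (where lgamma(k + 1) equals log(k!)) instead of building the huge factorials, and the helper name log_comb is illustrative.

```python
from math import exp, floor, lgamma, log

def log_comb(n, r):
    """Natural log of C(n, r), using lgamma(k + 1) == log(k!)."""
    return lgamma(n + 1) - lgamma(r + 1) - lgamma(n - r + 1)

log_x = log_comb(10000, 8000) + 8000 * log(0.8)

# Exponentiating directly fails: the result exceeds the float range.
try:
    exp(log_x)
except OverflowError:
    print("exp(log_x) does not fit in a float")

# Convert to base 10, then split into integer exponent and mantissa.
log10_x = log_x / log(10)
exponent = floor(log10_x)
mantissa = 10 ** (log10_x - exponent)

print(f"log_x = {log_x:.2f}")
print(f"x is roughly {mantissa:.2f}e{exponent}")
```

The mantissa is only as accurate as the floating-point logarithms allow, so treat the trailing digits as approximate.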