So i'm implementing a heuristic algorithm, and i've come across this function. I have an array of 1 to n (0 to n-1 on C, w/e). I want to choose a number of elements i'll copy to another array. Given a parameter y, (0 < y <= 1), i want to have a distribution of numbers whose average is (y * n). That means that whenever i call this function, it gives me a number, between 0 and n, and the average of these numbers is y*n. According to the author, "l" is a random number: 0 < l < n . On my test code its currently generating 0 <= l <= n. And i had the right code, but i'm messing with this for hours now, and i'm lazy to code it back. So i coded the first part of the function, for y <= 0.5 I set y to 0.2, and n to 100. That means it had to return a number between 0 and 99, with average 20. And the results aren't between 0 and n, but some floats. And the bigger n is, smaller this float is. This is the C test code. "x" is the "l" parameter. <pre class="prettyprint"><code>//hate how code tag works, it's not even working now int n = 100; float y = 0.2; float n_copy; for(int i = 0 ; i < 20 ; i++) { float x = (float) (rand()/(float)RAND_MAX); // 0 <= x <= 1 x = x * n; // 0 <= x <= n float p1 = (1 - y) / (n*y); float p2 = (1 - ( x / n )); float exp = (1 - (2*y)) / y; p2 = pow(p2, exp); n_copy = p1 * p2; printf("%.5f\n", n_copy); } </code></pre> And here are some results (5 decimals truncated): <pre class="prettyprint"><code>0.03354 0.00484 0.00003 0.00029 0.00020 0.00028 0.00263 0.01619 0.00032 0.00000 0.03598 0.03975 0.00704 0.00176 0.00001 0.01333 0.03396 0.02795 0.00005 0.00860 </code></pre> The article is: http://www.scribd.com/doc/3097936/cAS-The-Cunning-Ant-System pages 6 and 7. or search "cAS: cunning ant system" on google. So what am i doing wrong? i don't believe the author is wrong, because there are more than 5 papers describing this same function. all my internets to whoever helps me. This is important to my work. Thanks :)

You may misunderstand what is expected of you. Given a (properly normalized) PDF, and wanting to throw a random distribution consistent with it, you form the Cumulative Probability Distribution (CDF) by integrating the PDF, then invert the CDF, and use a uniform random predicate as the argument of the inverted function. <hr> A little more detail. <code>f_s(l)</code> is the PDF, and has been normalized on <code>[0,n)</code>. Now you integrate it to form the CDF <pre class="prettyprint"><code>g_s(l') = \int_0^{l'} dl f_s(l) </code></pre> Note that this is a definite integral to an unspecified endpoint which I have called <code>l'</code>. The CDF is accordingly a function of <code>l'</code>. Assuming we have the normalization right, <code>g_s(N) = 1.0</code>. If this is not so we apply a simple coefficient to fix it. Next invert the CDF and call the result <code>G^{-1}(x)</code>. For this you'll probably want to choose a particular value of gamma. Then throw uniform random number on <code>[0,n)</code>, and use those as the argument, <code>x</code>, to <code>G^{-1}</code>. The result should lie between <code>[0,1)</code>, and should be distributed according to <code>f_s</code>. Like Justin said, you can use a computer algebra system for the math.

Probability density function from a paper, implemented using C++, not working as intended

Q: How is probability density function implemented?

Probability Density Function Formula This is because, when X is continuous, we can ignore the endpoints of intervals while finding probabilities of continuous random variables. That means, for any constants a and b, P(a ≤ X ≤ b) = P(a < X ≤ b) = P(a ≤ X < b) = P(a < X < b).

Q: Why probability density function is not probability?

Because, for continuous random variables, the probability that 𝐗 takes on any particular value 𝒙 is 0.

Tags:

c++

probability

heuristics

montecarlo

ant-colony

So i'm implementing a heuristic algorithm, and i've come across this function.

I have an array of 1 to n (0 to n-1 on C, w/e). I want to choose a number of elements i'll copy to another array. Given a parameter y, (0 < y <= 1), i want to have a distribution of numbers whose average is (y * n). That means that whenever i call this function, it gives me a number, between 0 and n, and the average of these numbers is y*n.

According to the author, "l" is a random number: 0 < l < n . On my test code its currently generating 0 <= l <= n. And i had the right code, but i'm messing with this for hours now, and i'm lazy to code it back.

So i coded the first part of the function, for y <= 0.5 I set y to 0.2, and n to 100. That means it had to return a number between 0 and 99, with average 20. And the results aren't between 0 and n, but some floats. And the bigger n is, smaller this float is.

This is the C test code. "x" is the "l" parameter.

//hate how code tag works, it's not even working now  
int n = 100;  
float y = 0.2;  
float n_copy;  

for(int i = 0 ; i < 20 ; i++)  
{  
    float x = (float) (rand()/(float)RAND_MAX);  // 0 <= x <= 1  
    x = x * n;                                // 0 <= x <= n  
    float p1 = (1 - y) / (n*y);  
    float p2 = (1 - ( x / n ));  
    float exp = (1 - (2*y)) / y;  
    p2 = pow(p2, exp);  
    n_copy = p1 * p2;  
    printf("%.5f\n", n_copy);  
}

And here are some results (5 decimals truncated):

The article is:

http://www.scribd.com/doc/3097936/cAS-The-Cunning-Ant-System

pages 6 and 7.

or search "cAS: cunning ant system" on google.

So what am i doing wrong? i don't believe the author is wrong, because there are more than 5 papers describing this same function.

all my internets to whoever helps me. This is important to my work.

Thanks :)

708

asked Nov 05 '10 03:11

hfingler

2 Answers

You may misunderstand what is expected of you.

Given a (properly normalized) PDF, and wanting to throw a random distribution consistent with it, you form the Cumulative Probability Distribution (CDF) by integrating the PDF, then invert the CDF, and use a uniform random predicate as the argument of the inverted function.

A little more detail.

f_s(l) is the PDF, and has been normalized on [0,n).

Now you integrate it to form the CDF

g_s(l') = \int_0^{l'} dl f_s(l)

Note that this is a definite integral to an unspecified endpoint which I have called l'. The CDF is accordingly a function of l'. Assuming we have the normalization right, g_s(N) = 1.0. If this is not so we apply a simple coefficient to fix it.

Next invert the CDF and call the result G^{-1}(x). For this you'll probably want to choose a particular value of gamma.

Then throw uniform random number on [0,n), and use those as the argument, x, to G^{-1}. The result should lie between [0,1), and should be distributed according to f_s.

Like Justin said, you can use a computer algebra system for the math.

124

answered Nov 15 '22 13:11

dmckee --- ex-moderator kitten

dmckee is actually correct, but I thought that I would elaborate more and try to explain away some of the confusion here. I could definitely fail. f_s(l), the function you have in your pretty formula above, is the probability distribution function. It tells you, for a given input l between 0 and n, the probability that l is the segment length. The sum (integral) for all values between 0 and n should be equal to 1.

The graph at the top of page 7 confuses this point. It plots l vs. f_s(l), but you have to watch out for the stray factors it puts on the side. You notice that the values on the bottom go from 0 to 1, but there is a factor of x n on the side, which means that the l values actually go from 0 to n. Also, on the y-axis there is a x 1/n which means these values don't actually go up to about 3, they go to 3/n.

So what do you do now? Well, you need to solve for the cumulative distribution function by integrating the probability distribution function over l which actually turns out to be not too bad (I did it with the Wolfram Mathematica Online Integrator by using x for l and using only the equation for y <= .5). That however was using an indefinite integral and you are really integration along x from 0 to l. If we set the resulting equation equal to some variable (z for instance), the goal now is to solve for l as a function of z. z here is a random number between 0 and 1. You can try using a symbolic solver for this part if you would like (I would). Then you have not only achieved your goal of being able to pick random ls from this distribution, you have also achieved nirvana.

A little more work done

I'll help a little bit more. I tried doing what I said about for y <= .5, but the symbolic algebra system I was using wasn't able to do the inversion (some other system might be able to). However, then I decided to try using the equation for .5 < y <= 1. This turns out to be much easier. If I change l to x in f_s(l) I get

y / n / (1 - y) * (x / n)^((2 * y - 1) / (1 - y))

Integrating this over x from 0 to l I got (using Mathematica's Online Integrator):

(l / n)^(y / (1 - y))

It doesn't get much nicer than that with this sort of thing. If I set this equal to z and solve for l I get:

l = n * z^(1 / y - 1)      for .5 < y <= 1

One quick check is for y = 1. In this case, we get l = n no matter what z is. So far so good. Now, you just generate z (a random number between 0 and 1) and you get an l that is distributed as you desired for .5 < y <= 1. But wait, looking at the graph on page 7 you notice that the probability distribution function is symmetric. That means that we can use the above result to find the value for 0 < y <= .5. We just change l -> n-l and y -> 1-y and get

n - l = n * z^(1 / (1 - y) - 1)

l = n * (1 - z^(1 / (1 - y) - 1))      for 0 < y <= .5

Anyway, that should solve your problem unless I made some error somewhere. Good luck.

answered Nov 15 '22 12:11

Justin Peel

Related questions
                            
                                Creating objects of class through its name stored in a string?
                            
                                C++ object termination notification
                            
                                Why can't templates take function local types?
                            
                                What is the best way to sync computer time with an internet time server?
                            
                                Matrix compression methods
                            
                                How to use/install custom Directshow filter
                            
                                GlobalAddAtom() returns 0 and GetLastError = 0x5 on Win7, works on XP
                            
                                Stream of example RTP packets
                            
                                how to make sure that a file will be closed at the end of the run
                            
                                Is there a way to optimize doxygen for C++?
                            
                                error: using typedef-name after class
                            
                                Metaclass to parametrize Inheritance
                            
                                gtk minimum size
                            
                                Static member object initialization failure
                            
                                error while loading shared libraries: libstdc++.so.6: wrong ELF class: ELFCLASS64
                            
                                How to optimize a cycle?
                            
                                Syntax for calling templated method
                            
                                QApplication in unicode
                            
                                How are global variables stored?
                            
                                Objective-C enum error?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With