Approach 1:
C(n,r) = n!/(n-r)!r!
Approach 2:
In the book Combinatorial Algorithms by wilf, i have found this:
C(n,r) can be written as C(n-1,r) + C(n-1,r-1)
.
e.g.
C(7,4) = C(6,4) + C(6,3) = C(5,4) + C(5,3) + C(5,3) + C(5,2) . . . . . . . . After solving = C(4,4) + C(4,1) + 3*C(3,3) + 3*C(3,1) + 6*C(2,1) + 6*C(2,2)
As you can see, the final solution doesn't need any multiplication. In every form C(n,r), either n==r or r==1.
Here is the sample code i have implemented:
int foo(int n,int r) { if(n==r) return 1; if(r==1) return n; return foo(n-1,r) + foo(n-1,r-1); }
See output here.
In the approach 2, there are overlapping sub-problems where we are calling recursion to solve the same sub-problems again. We can avoid it by using Dynamic Programming.
I want to know which is the better way to calculate C(n,r)?.
How Do you Use NCR Formula in Probability? Combinations are a way to calculate the total number of outcomes of an event when the order of the outcomes does not matter. To calculate combinations we use the nCr formula: nCr = n! / r! * (n - r)!, where n = number of items, and r = number of items being chosen at a time.
Both approaches will save time, but the first one is very prone to integer overflow.
Approach 1:
This approach will generate result in shortest time (in at most n/2
iterations), and the possibility of overflow can be reduced by doing the multiplications carefully:
long long C(int n, int r) { if(r > n - r) r = n - r; // because C(n, r) == C(n, n - r) long long ans = 1; int i; for(i = 1; i <= r; i++) { ans *= n - r + i; ans /= i; } return ans; }
This code will start multiplication of the numerator from the smaller end, and as the product of any k
consecutive integers is divisible by k!
, there will be no divisibility problem. But the possibility of overflow is still there, another useful trick may be dividing n - r + i
and i
by their GCD before doing the multiplication and division (and still overflow may occur).
Approach 2:
In this approach, you'll be actually building up the Pascal's Triangle. The dynamic approach is much faster than the recursive one (the first one is O(n^2)
while the other is exponential). However, you'll need to use O(n^2)
memory too.
# define MAX 100 // assuming we need first 100 rows long long triangle[MAX + 1][MAX + 1]; void makeTriangle() { int i, j; // initialize the first row triangle[0][0] = 1; // C(0, 0) = 1 for(i = 1; i < MAX; i++) { triangle[i][0] = 1; // C(i, 0) = 1 for(j = 1; j <= i; j++) { triangle[i][j] = triangle[i - 1][j - 1] + triangle[i - 1][j]; } } } long long C(int n, int r) { return triangle[n][r]; }
Then you can look up any C(n, r)
in O(1)
time.
If you need a particular C(n, r)
(i.e. the full triangle is not needed), then the memory consumption can be made O(n)
by overwriting the same row of the triangle, top to bottom.
# define MAX 100 long long row[MAX + 1]; int C(int n, int r) { int i, j; // initialize by the first row row[0] = 1; // this is the value of C(0, 0) for(i = 1; i <= n; i++) { for(j = i; j > 0; j--) { // from the recurrence C(n, r) = C(n - 1, r - 1) + C(n - 1, r) row[j] += row[j - 1]; } } return row[r]; }
The inner loop is started from the end to simplify the calculations. If you start it from index 0, you'll need another variable to store the value being overwritten.
I think your recursive approach should work efficiently with DP
. But it will start giving problems once the constraints increase. See http://www.spoj.pl/problems/MARBLES/
Here is the function which i use in online judges and coding contests. So it works quite fast.
long combi(int n,int k) { long ans=1; k=k>n-k?n-k:k; int j=1; for(;j<=k;j++,n--) { if(n%j==0) { ans*=n/j; }else if(ans%j==0) { ans=ans/j*n; }else { ans=(ans*n)/j; } } return ans; }
It is an efficient implementation for your Approach #1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With