Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R: Plotting a Combinatorial Function

Tags:

math

r

I am working in the R programming language.

I am trying to make a plot from n=0 to n = 1000 for the following function (link: https://www.youtube.com/watch?v=iH2kATv49rc&t=7s):

enter image description here

First, I tried to define the P_00(2n) part:

combinatorial_square <- function(n) {
  (choose(2*n, n)^2) * ((1/4)^(2*n))
}

Then, I tried to write another function that performs the cumulative sum of the above function:

cumulative_sum <- function(N) {
  s <- 0
  for (n in 1:N) {
    s <- s + (choose(2*n, n)^2) * ((1/4)^(2*n))
  }
  return(s)
}

Then, I tried to plot it over a range of N's:

N <- 1:250
y <- sapply(N, cumulative_sum)

plot(N, y, type = "l", main = "Plot of cumulative_sum function", xlab = "N", ylab = "cumulative_sum(N)")

enter image description here

My Problem: I can not seem to plot this function for larger values of N (I think this might be because the computer is not able to calculate large combinatorial terms?):

 N <- 1:1000
    y <- sapply(N, cumulative_sum)

> tail(y)
[1] NaN NaN NaN NaN NaN NaN

Is there something I can do in R to approximate these larger factorials?

Currently, I was thinking of using some mathematical method to approximate the larger factorial terms involved in combinatorial expressions (e.g. https://en.wikipedia.org/wiki/Stirling%27s_approximation) - that is, re-write the cumulative_sum function so that it uses Stirling Approximation for each factorial term.

But is there an easier way to do this?

Thanks!

like image 800
stats_noob Avatar asked Oct 12 '25 16:10

stats_noob


2 Answers

The trick is to use logarithms. Base R has logarithm versions of the beta, gamma, choose and factorial functions. These versions are prefixed with an l, are numerically stable and can be invaluable when the numbers become very large. The relevant function is lchoose.

First test the functions' results.

combinatorial_square <- function(n) {
  (choose(2*n, n)^2) * ((1/4)^(2*n))
}
cumulative_sum <- function(N) {
  s <- 0
  for (n in 1:N) {
    s <- s + (choose(2*n, n)^2) * ((1/4)^(2*n))
  }
  return(s)
}

lcombinatorial_square <- function(n) {
  2*lchoose(2*n, n) + 2*n * log(1/4)
}

N <- 1:250
y1 <- sapply(N, combinatorial_square)
y2 <- sapply(N, lcombinatorial_square)

# all values are equal to floating-point accuracy
mapply(all.equal, log(y1), y2) |> all()
#> [1] TRUE
mapply(all.equal, y1, exp(y2)) |> all()
#> [1] TRUE

# but only 1% are exactly equal
# (note that the vectors log(y1) and y2 have the same attributes
# therefore the two instructions that follow are equivalent)
mapply(identical, y1, exp(y2)) |> mean()
#> [1] 0.012
mapply(`==`, y1, exp(y2)) |> mean()
#> [1] 0.012

# (note also that if the test is on log(y1) 3% are exactly equal)
mapply(identical, log(y1), y2) |> mean()
#> [1] 0.028
mapply(`==`, log(y1), y2) |> mean()
#> [1] 0.028

# now test with larger numbers
N <- 1:1000
y2 <- sapply(N, lcombinatorial_square)
y2 |> exp() |> is.finite() |> all()
#> [1] TRUE
y2 |> exp() |> max()
#> [1] 0.25
y2 |> exp() |> sum()
#> [1] 2.265321

Created on 2023-11-20 with reprex v2.0.2


Finally, compute the sums with bigger N and plot the summation.

lcombinatorial_square <- function(n) 2*lchoose(2*n, n) + 2*n * log(1/4)

# cumulative_sum_2a <- function(N) {
#   s <- 0
#   for (n in seq.int(N)) {
#     e <- exp(lcombinatorial_square(n))
#     s <- s + e
#   }
#   s
# }

cumulative_sum_2 <- function(N) {
  N |> 
    seq.int() |>
    sapply(lcombinatorial_square) |> 
    exp() |> 
    sum()
}

N <- 1:1000
y <- sapply(N, cumulative_sum_2)
plot(N, y, type = "l", main = "Plot of cumulative_sum function", 
     xlab = "N", ylab = "cumulative_sum(N)")

Created on 2023-11-20 with reprex v2.0.2

like image 127
Rui Barradas Avatar answered Oct 14 '25 06:10

Rui Barradas


You got NAN since choose(2*n, n)^2 explodes for large n, but you should have way to circumvent that issue for big numbers.


You can rearrange the terms in the math expression and make sure that each item is less than 1 and decay in the power law (in your case, it is power of 2), and then you will make it "converge" (not the same concept of "convergence" in series sum of the math domain), for example

f <- Vectorize(function(n) {
  v <- seq_len(n)
  prod((n + v) / (4 * v))^2
})

and then run (up to 10000 for exmaple)

N <- 1:10000
P <- cumsum(f(N))
plot(N, P,  type = "l", main = "Plot of cumulative_sum function", xlab = "N", ylab = "cumulative_sum(N)")

and you will obtain

enter image description here

like image 23
ThomasIsCoding Avatar answered Oct 14 '25 06:10

ThomasIsCoding