Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R's t-distribution says "full precision may not have been achieved"

I am working with a problem that routinely needs to compute the density of the t distribution rather far in the tails in R.

For example, using R's t distribution function, dt(1.424781, 1486, -5) returns [1] 2.75818e-10. Some of my final outputs (using this density as an input) do not match a reference value from analogous computations performed in MATLAB by my colleague, which I think may be due to imprecision in the t distribution's tails in R.

If I compare to MATLAB's t distribution function, nctpdf(1.424781, 1486, -5) returns ans = 4.3932e-10, which is quite a bit different than R's output.

edit: R prints two identical warning messages

In dt(1.424781, 1486, -5) : full precision may not have been achieved
in 'pnt{final}'

This is on Mac, R version 3.3.1

like image 660
Alexander Etz Avatar asked Aug 27 '16 17:08

Alexander Etz


2 Answers

It appears that the issue comes from the algorithm that R implements for the non-central t-distribution for this case. Two factors combine to produce the result:

  1. the supplied non-centrality parameter, -5, is an "extreme value."

From the Note section of the help file, ?pt,

The code for non-zero ncp is principally intended to be used for moderate values of ncp: it will not be highly accurate, especially in the tails, for large values.

So the algorithm that calculates these values is not intended to calculate extreme values like -5. We can see this by reducing the ncp value to a more moderate level, say -1:

dt(1.424781, 1486, -1)
[1] 0.0211143
  1. The requested value is in the upper tail of the distribution:

The Source section of ?pt says

For the non-central case of pt based on a C translation of

Lenth, R. V. (1989). Algorithm AS 243 — Cumulative distribution function of the non-central t distribution, Applied Statistics 38, 185–189.

This computes the lower tail only, so the upper tail suffers from cancellation and a warning will be given when this is likely to be significant.

For example, the same ncp value, -5 with the negative of the x value returns

dt(-1.424781, 1486, -5)
[1] 0.0006719519

without a warning.

like image 75
lmo Avatar answered Oct 08 '22 17:10

lmo


You could use Boost (available through the BH package) via Rcpp as an alternative:

// [[Rcpp::depends(BH)]]

#include <Rcpp.h>
#include <boost/math/distributions/non_central_t.hpp> 

using namespace boost::math;

// [[Rcpp::export]]
double dnct(const double x, const double df, const double ncp) {
  non_central_t dist(df, ncp);
  return pdf(dist, x);
}


/*** R
dnct(1.424781, 1486, -5)
*/

This returns:

[1] 4.393078e-10

I don't know if Boost or Matlab is more precise here, but results are at least similar. The boost docs give some information regarding precision.

like image 3
Roland Avatar answered Oct 08 '22 18:10

Roland