Compute the multivariate normal CDF in Java

Does anyone know of a reliable, accurate library to compute the multivariate normal (MVN) CDF in Java? I'm looking for something like MATLAB's mvncdf function. I need to be able to do it for dimensions of up to 10 or more. Most statistics/math libraries don't have this functionality. Being able to compute the log probability is a plus.

Judging from this post, no free implementation seems to be available for some other languages either. While a pure-Java implementation would rock, I would accept implementations in other languages that don't require licenses (not MATLAB or IMSL, for example) and can be called easily from Java with minimal overhead.

(This question is a derivative of a post on StackExchange math where I am trying to compute a probability of an ordering of normal random variables...if you're interested in trying to solve the problem directly using other mathematical methods, please do check it out.)

Andrew Mao asked Jan 06 '13 02:01


2 Answers

After doing some additional research, it seems that the following is the most reasonable way to go.

The multivariate normal CDF is not trivial to compute (especially for large dimensions) and there have been several academic papers written on the subject. Professor Alan Genz has a bunch of Fortran-77 subroutines that compute various multivariate densities and CDFs, available on his page here: http://www.math.wsu.edu/faculty/genz/software/software.html
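To make the difficulty concrete, here is a minimal pure-Java sketch of the naive alternative: plain Monte Carlo sampling through a Cholesky factor. This is not Genz's algorithm and not the code linked below; the class name `MvnCdfMonteCarlo` and its method signatures are made up for illustration. It is only meant as a rough sanity check to compare against a proper implementation.

```java
import java.util.Random;

// Hypothetical helper, not part of any library mentioned in this thread.
final class MvnCdfMonteCarlo {

    /** Lower-triangular Cholesky factor l of a symmetric positive-definite matrix, so l * l^T = sigma. */
    static double[][] cholesky(double[][] sigma) {
        int n = sigma.length;
        double[][] l = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j <= i; j++) {
                double sum = sigma[i][j];
                for (int k = 0; k < j; k++) {
                    sum -= l[i][k] * l[j][k];
                }
                if (i == j) {
                    if (sum <= 0.0) {
                        throw new IllegalArgumentException("covariance is not positive definite");
                    }
                    l[i][i] = Math.sqrt(sum);
                } else {
                    l[i][j] = sum / l[j][j];
                }
            }
        }
        return l;
    }

    /** Estimates P(X_1 <= upper_1, ..., X_n <= upper_n) for X ~ N(mean, sigma) by simple sampling. */
    static double cdf(double[] upper, double[] mean, double[][] sigma, int samples, long seed) {
        int n = upper.length;
        double[][] l = cholesky(sigma);
        Random rng = new Random(seed);
        double[] z = new double[n];
        long hits = 0;
        for (int s = 0; s < samples; s++) {
            // Draw independent standard normals, then correlate them via x = mean + l * z.
            for (int i = 0; i < n; i++) {
                z[i] = rng.nextGaussian();
            }
            boolean inside = true;
            for (int i = 0; i < n && inside; i++) {
                double x = mean[i];
                for (int k = 0; k <= i; k++) {
                    x += l[i][k] * z[k];
                }
                inside = x <= upper[i];
            }
            if (inside) {
                hits++;
            }
        }
        return (double) hits / samples;
    }

    public static void main(String[] args) {
        // Bivariate case with correlation 0.5; the closed form for P(X <= 0, Y <= 0) is 1/4 + asin(0.5)/(2*pi) = 1/3.
        double[][] sigma = {{1.0, 0.5}, {0.5, 1.0}};
        double p = cdf(new double[] {0.0, 0.0}, new double[] {0.0, 0.0}, sigma, 1_000_000, 42L);
        System.out.println("estimate = " + p + ", closed form = " + (1.0 / 3.0));
    }
}
```

The error of this estimator only shrinks like 1/sqrt(samples), so small tail probabilities (and therefore log probabilities) come out very poorly. That is a large part of why the specialized routines are worth the trouble.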

As you can see from some of that code, it's not exactly a cakewalk to re-implement in another language, and that's probably why it hasn't been done unless someone has paid for it. A lot of mathematical/numerical programming is done in Fortran at the research level, so that's where most of the best code is.

As such, for optimal results, it would probably be best to call the (native-compiled) Fortran subroutine directly using JNI or JNA. JNA seems to be the easier of the two to set up, following instructions such as these: http://www.javaforge.com/wiki/66061. Using that and some other references, I've implemented the Java-JNA-Fortran link to call the MVNEXP (expected value) and MVNDST (CDF) subroutines. You can check out the code here:

  • Java: https://github.com/mizzao/libmao/tree/master/src/main/java/net/andrewmao/probability
  • Fortran (modified) and Makefile: https://github.com/mizzao/libmao/tree/master/src/main/fortran

Also worth pointing out: there does exist native Java code for some bivariate distributions and other things you won't find in Commons Math; it's adapted from the source above: http://www.iro.umontreal.ca/~simardr/ssj/indexe.html . This is a very good math library that I hadn't found until now.

Andrew Mao answered Nov 03 '22 22:11


Adding to the OP's solution (i.e., that the best option was Fortran code, and nothing else identified came close), one way to get to a pure Java library is with the f2j (Fortran-to-Java) compiler: http://icl.cs.utk.edu/f2j

I've found the code it generates to be quite workable (see, for example, this MINPACK library: http://www1.fpl.fs.fed.us/optimization/LmderTest_f77.html ). The only annoyance I recall was that arrays start from 1 rather than 0, but that can be dealt with easily (if you care) with a trivial wrapper function.
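For what it's worth, the kind of wrapper I mean looks roughly like the sketch below. Everything here is hypothetical: `sumOneBased` is just a stand-in for a translated routine that keeps Fortran-style indexing by leaving element 0 unused, not actual f2j output, and the real generated signatures may well differ.

```java
// Hypothetical illustration only; not generated by f2j.
final class OneBasedAdapter {

    /** Stand-in for a translated Fortran routine: reads x[1..n] and ignores x[0]. */
    static double sumOneBased(int n, double[] x) {
        double total = 0.0;
        for (int i = 1; i <= n; i++) {
            total += x[i];
        }
        return total;
    }

    /** Copies a normal 0-based Java array into a 1-based one (slot 0 is left as padding). */
    static double[] toOneBased(double[] zeroBased) {
        double[] shifted = new double[zeroBased.length + 1];
        System.arraycopy(zeroBased, 0, shifted, 1, zeroBased.length);
        return shifted;
    }

    /** The wrapper callers actually use: plain 0-based arrays in, result out. */
    static double sum(double[] x) {
        return sumOneBased(x.length, toOneBased(x));
    }
}
```

With this in place, callers write `OneBasedAdapter.sum(new double[] {1.0, 2.0, 3.0})` and never see the index shift. The copy costs an extra allocation per call, which only matters if the routine is called in a tight loop.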

@Andrew: if you do convert it, I'd be interested!

Matt S. answered Nov 04 '22 00:11