I have to store the product of several probability values that are really low (for example, 1E-80). Using the primitive Java double would result in zero because of underflow. I don't want the value to go to zero, because later on there will be a larger number (for example, 1E100) that will bring the value back within the range that a double can handle.
So, I created a class of my own (MyDouble) that stores the base part and the exponent part separately. When doing calculations, for example multiplication, I multiply the base parts and add the exponents.
The program is fast with the primitive double type. However, when I use my own class (MyDouble) the program is really slow. I think this is because of the new objects that I have to create each time I perform a simple operation, and the garbage collector has to do a lot of work when the objects are no longer needed.
My question is: is there a better way to solve this problem? If not, is there a way to speed up the program with my own class (MyDouble)?
[Note: taking the log and later taking the exponent does not solve my problem]
MyDouble class:
    public class MyDouble {
        // The fields and the one-argument constructor were not shown in the post;
        // they are assumed here so that the class compiles.
        double base;
        int power;

        public MyDouble(double base, int power) {
            this.base = base;
            this.power = power;
        }

        // Assumed: splits a double into base and decimal exponent via its string form.
        public MyDouble(double value) {
            String[] parts = String.valueOf(value).split("E");
            this.base = Double.parseDouble(parts[0]);
            this.power = parts.length == 2 ? Integer.parseInt(parts[1]) : 0;
        }

        public static MyDouble multiply(double... values) {
            double prodBase = 1;
            int prodPower = 0;
            for (double val : values) {
                MyDouble ad = new MyDouble(val);
                prodBase *= ad.base;
                prodPower += ad.power;
            }
            // Renormalize the product of the bases by parsing its string form.
            String[] splitted = String.valueOf(prodBase).split("E");
            double newBase = Double.parseDouble(splitted[0]);
            int newPower = splitted.length == 2 ? Integer.parseInt(splitted[1]) : 0;
            return new MyDouble(newBase, newPower + prodPower);
        }
    }
The standard way to solve this is to work in log space; it trivialises the problem. When you say it doesn't work, can you give specific details of why? Probability underflow is a common issue in probabilistic models, and I don't think I've ever seen it solved any other way.
Recall that log(a*b) is just log(a) + log(b). Similarly, log(a/b) is log(a) - log(b). I assume that, since you're working with probabilities, it's multiplication and division that are causing the underflow issues. The drawback of log space is that you need special routines to calculate log(a+b), which I can direct you to if this is your issue.
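For illustration, one common routine for log(a+b) is the "log-sum-exp" trick: factor out the larger of the two logs so that the remaining exponential cannot overflow. The sketch below is only an example; the class name LogAddSketch and the method name logAdd are made up here, and it assumes both inputs are natural logs of non-negative values.

```java
final class LogAddSketch {
    // Returns log(a + b) given la = log(a) and lb = log(b), without leaving log space.
    // Assumption: a and b are non-negative, so la and lb are finite or -Infinity.
    static double logAdd(double la, double lb) {
        if (la == Double.NEGATIVE_INFINITY) return lb; // a == 0
        if (lb == Double.NEGATIVE_INFINITY) return la; // b == 0
        double hi = Math.max(la, lb);
        double lo = Math.min(la, lb);
        // log(a + b) = hi + log(1 + exp(lo - hi)); lo - hi <= 0, so exp cannot overflow.
        return hi + Math.log1p(Math.exp(lo - hi));
    }

    public static void main(String[] args) {
        double la = -80 * Math.log(10); // log(1e-80)
        double lb = -81 * Math.log(10); // log(1e-81)
        // Prints log(1e-80 + 1e-81), i.e. log(1.1e-80), computed entirely in log space.
        System.out.println(logAdd(la, lb));
    }
}
```

Because only the difference lo - hi is exponentiated, this works even when a and b themselves are far too small to represent as doubles.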
So the simple answer is, work in log space, and re-exponentiate at the end to get a human-readable number.
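As a rough sketch of what that looks like in Java (the class and method names below are invented for this example, and the factors are assumed to be strictly positive), the product is accumulated as a sum of logs and only exponentiated once the large factors have been folded in:

```java
final class LogProductSketch {
    // Returns the natural log of the product of the given factors.
    // Assumption: every factor is strictly positive.
    static double logProduct(double... factors) {
        double sum = 0.0;
        for (double f : factors) {
            sum += Math.log(f);
        }
        return sum;
    }

    public static void main(String[] args) {
        // As plain doubles, five factors of 1e-80 underflow to 0.0 (the true value is 1e-400).
        double naive = 1e-80 * 1e-80 * 1e-80 * 1e-80 * 1e-80;
        System.out.println(naive);

        // In log space the intermediate value is an ordinary number of modest size,
        // and three factors of 1e100 bring the result back into range (about 1e-100).
        double logResult = logProduct(1e-80, 1e-80, 1e-80, 1e-80, 1e-80, 1e100, 1e100, 1e100);
        System.out.println(Math.exp(logResult));
    }
}
```

The only primitive values involved are doubles, so no extra objects are allocated per multiplication.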
You are parsing strings every time you multiply. Why not split each value into its real (mantissa) part and its exponent part once, as a pre-calculation step, and then write algorithms for multiplication, addition, division, power and so on that work directly on that structure?
You could also add a flag for big/small numbers. I think you will not use both 1e100 and 1e-100 in one calculation (so you could simplify some calculations), and you could improve calculation time for the different pairs: (large, large), (small, small), (large, small).
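A minimal sketch of such a structure, under a few assumptions: it keeps a binary exponent instead of the decimal one used in the question (so that the split can be done with Math.getExponent and Math.scalb rather than string parsing), it is mutable so that multiplying does not allocate a new object, and the class name ScaledDouble and its methods are invented for this example.

```java
final class ScaledDouble {
    private double mantissa; // magnitude in [1, 2) for normal, non-zero values
    private long exponent;   // power of two: value = mantissa * 2^exponent

    ScaledDouble(double value) {
        mantissa = value;
        exponent = 0;
        normalize();
    }

    // Moves the binary exponent of the mantissa into the exponent field.
    private void normalize() {
        if (mantissa == 0.0 || Double.isNaN(mantissa) || Double.isInfinite(mantissa)) {
            exponent = 0;
            return;
        }
        int e = Math.getExponent(mantissa);
        mantissa = Math.scalb(mantissa, -e); // exact: only the exponent bits change
        exponent += e;
    }

    // Multiplies this value in place by a plain double; no strings, no new objects.
    void multiply(double factor) {
        if (factor == 0.0 || Double.isNaN(factor) || Double.isInfinite(factor)) {
            mantissa *= factor;
            normalize();
            return;
        }
        int fe = Math.getExponent(factor);
        mantissa *= Math.scalb(factor, -fe);
        exponent += fe;
        normalize();
    }

    // Converts back to a double; yields 0.0 or Infinity if the value is out of range.
    double toDouble() {
        long clamped = Math.max(-5000L, Math.min(5000L, exponent));
        return Math.scalb(mantissa, (int) clamped);
    }

    public static void main(String[] args) {
        ScaledDouble p = new ScaledDouble(1.0);
        for (int i = 0; i < 5; i++) p.multiply(1e-80);  // about 1e-400, below double range
        for (int i = 0; i < 3; i++) p.multiply(1e100);  // back in range, about 1e-100
        System.out.println(p.toDouble());
    }
}
```

Reusing one accumulator object for a whole chain of multiplications is what avoids the garbage-collection pressure described in the question; an immutable variant would be simpler to reason about but would allocate on every operation.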