
A faster alternative to DecimalFormat.format()?


In order to improve its performance, I have been profiling one of my applications with the VisualVM sampler, using the minimum sampling period of 20ms. According to the profiler, the main thread spends almost a quarter of its CPU time in the DecimalFormat.format() method.

I am using DecimalFormat.format() with the 0.000000 pattern to "convert" double numbers to a string representation with exactly six decimal digits. I know that this method is relatively expensive and that it is called many times, but I was still somewhat surprised by these results.

  1. To what degree are the results of such a sampling profiler accurate? How would I go about verifying them - preferably without resorting to an instrumenting profiler?

  2. Is there a faster alternative to DecimalFormat for my use case? Would it make sense to roll out my own NumberFormat subclass?

UPDATE:

I created a micro-benchmark to compare the performance of the following three methods:

  • DecimalFormat.format(): Single DecimalFormat object reused multiple times.

  • String.format(): Multiple independent calls. Internally this method boils down to

    public static String format(String format, Object ... args) {
        return new Formatter().format(format, args).toString();
    }
    

    Therefore I expected its performance to be very similar to Formatter.format().

  • Formatter.format(): Single Formatter object reused multiple times.

    This method is slightly awkward - Formatter objects created with the default constructor append all strings created by the format() method to an internal StringBuilder object, which is not properly accessible and therefore cannot be cleared. As a consequence, multiple calls to format() will create a concatenation of all resulting strings.

    To work around this issue, I provided my own StringBuilder instance that I cleared before use with a setLength(0) call.
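
The workaround described above might look roughly like the following. This is only a sketch, not the code used in the benchmark: the class name ReusableFormatter, the method name format6 and the choice of Locale.ROOT are assumptions made for illustration.

```java
import java.util.Formatter;
import java.util.Locale;

final class ReusableFormatter {
    // The StringBuilder is owned by this class, so it can be cleared between calls.
    private final StringBuilder buffer = new StringBuilder(32);
    // Locale.ROOT is an assumption of this sketch; it keeps '.' as the decimal separator.
    private final Formatter formatter = new Formatter(buffer, Locale.ROOT);

    /** Formats value with six decimal digits. Not safe for concurrent use. */
    String format6(double value) {
        buffer.setLength(0);              // discard the previous result
        formatter.format("%.6f", value);  // appends the new result to buffer
        return buffer.toString();
    }

    public static void main(String[] args) {
        ReusableFormatter f = new ReusableFormatter();
        System.out.println(f.format6(1.23456789));
        System.out.println(f.format6(0.5));
    }
}
```

Without the setLength(0) call, the second result would be appended to the first instead of replacing it.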

The results were interesting:

  • DecimalFormat.format() was the baseline at 1.4 µs per call.
  • String.format() was slower by a factor of two at 2.7 µs per call.
  • Formatter.format() was also slower by a factor of two at 2.5 µs per call.

Right now it looks like DecimalFormat.format() is still the fastest among these alternatives.
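
For readers who want to repeat the comparison, a naive timing loop could be structured as below. This is a sketch under assumptions, not the benchmark that produced the numbers above: all names are hypothetical, Locale.ROOT is chosen only to make the three outputs comparable, and a hand-written loop like this is sensitive to JIT warm-up, so a harness such as JMH is likely to give more trustworthy figures.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Formatter;
import java.util.Locale;
import java.util.function.DoubleFunction;

final class FormatComparisonSketch {
    // One shared instance of each reusable formatter (single-threaded use only).
    static final DecimalFormat DECIMAL_FORMAT =
            new DecimalFormat("0.000000", DecimalFormatSymbols.getInstance(Locale.ROOT));
    static final StringBuilder BUFFER = new StringBuilder();
    static final Formatter FORMATTER = new Formatter(BUFFER, Locale.ROOT);

    // Written after each run so the formatting work cannot be optimized away.
    static volatile long sink;

    static String viaDecimalFormat(double d) {
        return DECIMAL_FORMAT.format(d);
    }

    static String viaStringFormat(double d) {
        return String.format(Locale.ROOT, "%.6f", d);
    }

    static String viaFormatter(double d) {
        BUFFER.setLength(0);            // clear the previous result
        FORMATTER.format("%.6f", d);
        return BUFFER.toString();
    }

    /** Average wall-clock nanoseconds per call over the given number of runs. */
    static long nanosPerCall(DoubleFunction<String> method, int runs) {
        long chars = 0;
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            chars += method.apply(i * 1e-3).length();
        }
        long elapsed = System.nanoTime() - start;
        sink = chars;
        return elapsed / runs;
    }

    public static void main(String[] args) {
        int runs = 1_000_000;
        for (int round = 0; round < 3; round++) {   // the early rounds double as warm-up
            System.out.printf("DecimalFormat: %d ns, String.format: %d ns, Formatter: %d ns%n",
                    nanosPerCall(FormatComparisonSketch::viaDecimalFormat, runs),
                    nanosPerCall(FormatComparisonSketch::viaStringFormat, runs),
                    nanosPerCall(FormatComparisonSketch::viaFormatter, runs));
        }
    }
}
```

Note that DecimalFormat rounds half-even by default while the %.6f conversion rounds half-up, so the outputs can differ on exact ties.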

thkala asked Dec 18 '11 18:12



2 Answers

Given that you know exactly what you want, you can write your own routine.

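public static void appendTo6(StringBuilder builder, double d) {
    // Emit the sign first, then work with the absolute value.
    if (d < 0) {
        builder.append('-');
        d = -d;
    }
    // The scaled value must fit in a long.
    if (d * 1e6 + 0.5 > Long.MAX_VALUE) {
        // TODO write a fall back.
        throw new IllegalArgumentException("number too large");
    }
    // Shift six decimal places into the integer part; adding 0.5 rounds half up.
    long scaled = (long) (d * 1e6 + 0.5);
    // Start with seven digits: one before the decimal point and six after it.
    long factor = 1000000;
    int scale = 7;
    // Grow factor until it lines up with the most significant digit of scaled.
    long scaled2 = scaled / 10;
    while (factor <= scaled2) {
        factor *= 10;
        scale++;
    }
    // Emit the digits from most to least significant.
    while (scale > 0) {
        if (scale == 6)
            builder.append('.');
        long c = scaled / factor % 10;
        factor /= 10;
        builder.append((char) ('0' + c));
        scale--;
    }
}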

@Test
public void testCases() {
    for (String s : "-0.000001,0.000009,-0.000010,0.100000,1.100000,10.100000".split(",")) {
        double d = Double.parseDouble(s);
        StringBuilder sb = new StringBuilder();
        appendTo6(sb, d);
        assertEquals(s, sb.toString());
    }
}

public static void main(String[] args) {
    StringBuilder sb = new StringBuilder();
    long start = System.nanoTime();
    final int runs = 20000000;
    for (int i = 0; i < runs; i++) {
        appendTo6(sb, i * 1e-6);
        sb.setLength(0);
    }
    long time = System.nanoTime() - start;
    System.out.printf("Took %,d ns per append double%n", time / runs);
}

prints

Took 128 ns per append double

If you want even more performance, you can write to a direct ByteBuffer (assuming you want to write the data somewhere), so that the data you produce does not need to be copied or encoded, assuming that is acceptable.
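
As a rough illustration of the ByteBuffer idea, the same digit loop can emit ASCII bytes instead of chars. This is a sketch, not code from the original answer: the class name is hypothetical, the buffer size is arbitrary, and the range check is written slightly differently so that NaN is rejected too.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ByteBufferAppendSketch {
    /** Writes d with six decimal digits into out as ASCII bytes. */
    static void appendTo6(ByteBuffer out, double d) {
        if (d < 0) {
            out.put((byte) '-');
            d = -d;
        }
        double rounded = d * 1e6 + 0.5;
        // Negated '<' so that NaN is rejected as well (an addition made in this sketch).
        if (!(rounded < Long.MAX_VALUE)) {
            throw new IllegalArgumentException("number out of range");
        }
        long scaled = (long) rounded;
        long factor = 1000000;
        int scale = 7;
        long scaled2 = scaled / 10;
        while (factor <= scaled2) {
            factor *= 10;
            scale++;
        }
        while (scale > 0) {
            if (scale == 6)
                out.put((byte) '.');
            long c = scaled / factor % 10;
            factor /= 10;
            out.put((byte) ('0' + c));   // ASCII digit, no charset encoding step needed
            scale--;
        }
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(64);
        appendTo6(buffer, -10.1);
        buffer.flip();   // switch the buffer from writing to reading
        // Decoding is done here only to display the bytes that were written.
        System.out.println(StandardCharsets.US_ASCII.decode(buffer));
    }
}
```

In real use the flipped buffer would typically be handed to a channel write rather than decoded back into a string.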

NOTE: This is limited to positive/negative values with a magnitude below roughly 9 trillion (Long.MAX_VALUE/1e6). You can add special handling if this might be an issue.
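
One possible shape for that special handling is a slow path based on BigDecimal, called in place of the throw once the sign has been written. The helper below is hypothetical and is only a sketch; it uses RoundingMode.HALF_UP to mirror the +0.5 rounding of the fast path.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class LargeValueFallback {
    /** Hypothetical slow path for magnitudes the long-based fast path cannot hold. */
    static void appendLarge(StringBuilder builder, double d) {
        if (Double.isNaN(d) || Double.isInfinite(d)) {
            builder.append(d);   // appends "NaN", "Infinity" or "-Infinity"
            return;
        }
        // new BigDecimal(double) is exact; setScale then rounds to six decimal digits.
        builder.append(new BigDecimal(d).setScale(6, RoundingMode.HALF_UP).toPlainString());
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        appendLarge(sb, 1e13);
        System.out.println(sb);
    }
}
```

This path allocates and is much slower than the digit loop, so it only makes sense for the rare out-of-range values.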

Peter Lawrey answered Nov 20 '22 22:11



An alternative would be to use the string Formatter. Give it a try to see whether it performs better:

String.format("%.6f", 1.23456789)

Or even better, create a single formatter and reuse it - as long as there are no multithreading issues, since formatters are not necessarily safe for multithreaded access:

Formatter formatter = new Formatter();
// presumably, the formatter would be called multiple times
System.out.println(formatter.format("%.6f", 1.23456789));
formatter.close();
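
If several threads need to format numbers, one way to keep reusing a formatter without sharing it is to give each thread its own instance. The following is a sketch under assumptions: the class and method names are hypothetical, and Locale.ROOT is chosen only to fix the decimal separator.

```java
import java.util.Formatter;
import java.util.Locale;

final class PerThreadFormatter {
    // Each thread lazily gets its own Formatter backed by its own StringBuilder.
    private static final ThreadLocal<Formatter> FORMATTER =
            ThreadLocal.withInitial(() -> new Formatter(new StringBuilder(32), Locale.ROOT));

    static String format6(double value) {
        Formatter formatter = FORMATTER.get();
        // out() returns the destination passed to the constructor, here a StringBuilder.
        StringBuilder buffer = (StringBuilder) formatter.out();
        buffer.setLength(0);              // drop the previous result for this thread
        formatter.format("%.6f", value);
        return buffer.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> System.out.println(
                Thread.currentThread().getName() + ": " + format6(1.23456789));
        Thread a = new Thread(task, "worker-a");
        Thread b = new Thread(task, "worker-b");
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```

Clearing the buffer before each call also avoids the concatenation that occurs when a default-constructed Formatter is called repeatedly.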

Óscar López answered Nov 20 '22 23:11
