Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best way to generate a List<Double> sequence of values given start, end, and step?

I'm actually very surprised I was unable to find the answer to this here, though maybe I'm just using the wrong search terms or something. Closest I could find is this, but they ask about generating a specific range of doubles with a specific step size, and the answers treat it as such. I need something that will generate the numbers with arbitrary start, end and step size.

I figure there has to be some method like this in a library somewhere already, but if so I wasn't able to find it easily (again, maybe I'm just using the wrong search terms or something). So here's what I've cooked up on my own in the last few minutes to do this:

import java.lang.Math;
import java.util.List;
import java.util.ArrayList;

public class DoubleSequenceGenerator {


     /**
     * Generates a List of Double values beginning with `start` and ending with
     * the last step from `start` which includes the provided `end` value.
     **/
    public static List<Double> generateSequence(double start, double end, double step) {
        Double numValues = (end-start)/step + 1.0;
        List<Double> sequence = new ArrayList<Double>(numValues.intValue());

        sequence.add(start);
        for (int i=1; i < numValues; i++) {
          sequence.add(start + step*i);
        }

        return sequence;
    }

    /**
     * Generates a List of Double values beginning with `start` and ending with
     * the last step from `start` which includes the provided `end` value.
     * 
     * Each number in the sequence is rounded to the precision of the `step`
     * value. For instance, if step=0.025, values will round to the nearest
     * thousandth value (0.001).
     **/
    public static List<Double> generateSequenceRounded(double start, double end, double step) {

        if (step != Math.floor(step)) {
            Double numValues = (end-start)/step + 1.0;
            List<Double> sequence = new ArrayList<Double>(numValues.intValue());

            double fraction = step - Math.floor(step);
            double mult = 10;
            while (mult*fraction < 1.0) {
                mult *= 10;
            }

            sequence.add(start);
            for (int i=1; i < numValues; i++) {
              sequence.add(Math.round(mult*(start + step*i))/mult);
            }

            return sequence;
        }

        return generateSequence(start, end, step);
    }

}

These methods run a simple loop multiplying the step by the sequence index and adding to the start offset. This mitigates compounding floating-point errors which would occur with continuous incrementation (such as adding the step to a variable on each iteration).

I added the generateSequenceRounded method for those cases where a fractional step size can cause noticeable floating-point errors. It does require a bit more arithmetic, so in extremely performance sensitive situations such as ours, it's nice to have the option of using the simpler method when the rounding is unnecessary. I suspect that in most general use cases the rounding overhead would be negligible.

Note that I intentionally excluded logic for handling "abnormal" arguments such as Infinity, NaN, start > end, or a negative step size for simplicity and desire to focus on the question at hand.

Here's some example usage and corresponding output:

System.out.println(DoubleSequenceGenerator.generateSequence(0.0, 2.0, 0.2))
System.out.println(DoubleSequenceGenerator.generateSequenceRounded(0.0, 2.0, 0.2));
System.out.println(DoubleSequenceGenerator.generateSequence(0.0, 102.0, 10.2));
System.out.println(DoubleSequenceGenerator.generateSequenceRounded(0.0, 102.0, 10.2));
[0.0, 0.2, 0.4, 0.6000000000000001, 0.8, 1.0, 1.2000000000000002, 1.4000000000000001, 1.6, 1.8, 2.0]
[0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0]
[0.0, 10.2, 20.4, 30.599999999999998, 40.8, 51.0, 61.199999999999996, 71.39999999999999, 81.6, 91.8, 102.0]
[0.0, 10.2, 20.4, 30.6, 40.8, 51.0, 61.2, 71.4, 81.6, 91.8, 102.0]

Is there an existing library that provides this kind of functionality already?

If not, are there any issues with my approach?

Does anyone have a better approach to this?

like image 767
NanoWizard Avatar asked Dec 21 '19 22:12

NanoWizard


4 Answers

Sequences can be easily generated using Java 11 Stream API.

The straightforward approach is to use DoubleStream:

public static List<Double> generateSequenceDoubleStream(double start, double end, double step) {
  return DoubleStream.iterate(start, d -> d <= end, d -> d + step)
      .boxed()
      .collect(toList());
}

On ranges with a large number of iterations, double precision error could accumulate resulting in bigger error closer to the end of the range. The error can be minimised by switching to IntStream and using integers and single double multiplier:

public static List<Double> generateSequenceIntStream(int start, int end, int step, double multiplier) {
  return IntStream.iterate(start, i -> i <= end, i -> i + step)
      .mapToDouble(i -> i * multiplier)
      .boxed()
      .collect(toList());
}

To get rid of a double precision error at all, BigDecimal can be used:

public static List<Double> generateSequenceBigDecimal(BigDecimal start, BigDecimal end, BigDecimal step) {
  return Stream.iterate(start, d -> d.compareTo(end) <= 0, d -> d.add(step))
      .mapToDouble(BigDecimal::doubleValue)
      .boxed()
      .collect(toList());
}

Examples:

public static void main(String[] args) {
  System.out.println(generateSequenceDoubleStream(0.0, 2.0, 0.2));
  //[0.0, 0.2, 0.4, 0.6000000000000001, 0.8, 1.0, 1.2, 1.4, 1.5999999999999999, 1.7999999999999998, 1.9999999999999998]

  System.out.println(generateSequenceIntStream(0, 20, 2, 0.1));
  //[0.0, 0.2, 0.4, 0.6000000000000001, 0.8, 1.0, 1.2000000000000002, 1.4000000000000001, 1.6, 1.8, 2.0]

  System.out.println(generateSequenceBigDecimal(new BigDecimal("0"), new BigDecimal("2"), new BigDecimal("0.2")));
  //[0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0]
}

Method iterate with this signature (3 parameters) was added in Java 9. So, for Java 8 the code looks like

DoubleStream.iterate(start, d -> d + step)
    .limit((int) (1 + (end - start) / step))
like image 70
Evgeniy Khyst Avatar answered Oct 14 '22 10:10

Evgeniy Khyst


Me personally, I would shorten the DoubleSequenceGenerator class up a bit for other goodies and use only one sequence generator method that contains the option to utilize whatever desired precision wanted or utilize no precision at all:

In the generator method below, if nothing (or any value less than 0) is supplied to the optional setPrecision parameter then no decimal precision rounding is carried out. If 0 is supplied for a precision value then the numbers are rounded to their nearest whole number (ie: 89.674 is rounded to 90.0). If a specific precision value greater than 0 is supplied then values are converted to that decimal precision.

BigDecimal is used here for...well....precision:

import java.util.List;
import java.util.ArrayList;
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DoubleSequenceGenerator {

     public static List<Double> generateSequence(double start, double end, 
                                          double step, int... setPrecision) {
        int precision = -1;
        if (setPrecision.length > 0) {
            precision = setPrecision[0];
        }
        List<Double> sequence = new ArrayList<>();
        for (double val = start; val < end; val+= step) {
            if (precision > -1) {
                sequence.add(BigDecimal.valueOf(val).setScale(precision, RoundingMode.HALF_UP).doubleValue());
            }
            else {
                sequence.add(BigDecimal.valueOf(val).doubleValue());
            }
        }
        if (sequence.get(sequence.size() - 1) < end) { 
            sequence.add(end); 
        }
        return sequence;
    }    

    // Other class goodies here ....
}

And in main():

System.out.println(generateSequence(0.0, 2.0, 0.2));
System.out.println(generateSequence(0.0, 2.0, 0.2, 0));
System.out.println(generateSequence(0.0, 2.0, 0.2, 1));
System.out.println();
System.out.println(generateSequence(0.0, 102.0, 10.2, 0));
System.out.println(generateSequence(0.0, 102.0, 10.2, 0));
System.out.println(generateSequence(0.0, 102.0, 10.2, 1));

And the console displays:

[0.0, 0.2, 0.4, 0.6000000000000001, 0.8, 1.0, 1.2, 1.4, 1.5999999999999999, 1.7999999999999998, 1.9999999999999998, 2.0]
[0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0]
[0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0]

[0.0, 10.2, 20.4, 30.599999999999998, 40.8, 51.0, 61.2, 71.4, 81.60000000000001, 91.80000000000001, 102.0]
[0.0, 10.0, 20.0, 31.0, 41.0, 51.0, 61.0, 71.0, 82.0, 92.0, 102.0]
[0.0, 10.2, 20.4, 30.6, 40.8, 51.0, 61.2, 71.4, 81.6, 91.8, 102.0]
like image 38
DevilsHnd Avatar answered Oct 14 '22 12:10

DevilsHnd


  1. Is there an existing library that provides this kind of functionality already?

    Sorry, I don't know, but judging by other answers, and their relative simplicity - no, there isn't. No need. Well, almost...

  2. If not, are there any issues with my approach?

    Yes and no. You have at least one bug, and some room for performance boost, but the approach itself is correct.

    1. Your bug: rounding error (just change while (mult*fraction < 1.0) to while (mult*fraction < 10.0) and that should fix it)
    2. All the others do not reach the end... well, maybe they just weren't observant enough to read comments in your code
    3. All the others are slower.
    4. Just changing condition in the main loop from int < Double to int < int will noticeably increase the speed of your code
  3. Does anyone have a better approach to this?

    Hmm... In what way?

    1. Simplicity? generateSequenceDoubleStream of @Evgeniy Khyst looks quite simple. And should be used... but maybe no, because of next two points
    2. Precise? generateSequenceDoubleStream is not! But still can be saved with the pattern start + step*i. And start + step*i pattern is precise. Only BigDouble and fixed-point arithmetic can beat it. But BigDoubles are slow, and manual fixed-point arithmetic is tedious and may be inappropriate for your data. By the way, on the matters of precision, you can entertain yourself with this: https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
    3. Speed... well now we are on shaky grounds. Check out this repl https://repl.it/repls/RespectfulSufficientWorker I do not have a decent test stand right now, so I used repl.it... which is totally inadequate for performance testing, but it's not the main point. The point is - there is no definite answer. Except that maybe in your case, which is not totally clear from you question, you definitely should not use BigDecimal (read further).

      I've tried to play and optimize for big inputs. And your original code, with some minor changes - the fastest. But maybe you need enormous amounts of small Lists? Then that can be a totally different story.

      This code is quite simple to my taste, and fast enough:

        public static List<Double> genNoRoundDirectToDouble(double start, double end, double step) {
        int len = (int)Math.ceil((end-start)/step) + 1;
        var sequence = new ArrayList<Double>(len);
        sequence.add(start);
        for (int i=1 ; i < len ; ++i) sequence.add(start + step*i);
        return sequence;
        }
    

    If you prefer a more elegant way (or we should call it idiomatic), I, personally, would suggest:

    public static List<Double> gen_DoubleStream_presice(double start, double end, double step) {
        return IntStream.range(0, (int)Math.ceil((end-start)/step) + 1)
            .mapToDouble(i -> start + i * step)
            .boxed()
            .collect(Collectors.toList());
    }
    

    Anyway, possible performance boosts are:

    1. Try switching from Double to double, and if you really need them, you can switch back again, judging by the tests, it still may be faster. (But don't trust my, try it yourself with your data in your environment. As I said - repl.it sucks for benchmarks)
    2. A little magic: separate loop for Math.round()... maybe it has something to do with data locality. I do not recommend this - result is very unstable. But it's fun.

      double[] sequence = new double[len];
      for (int i=1; i < len; ++i) sequence[i] = start + step*i;
      List<Double> list = new ArrayList<Double>(len);
      list.add(start);
      for (int i=1; i < len; ++i) list.add(Math.round(sequence[i])/mult);
      return list;
      
    3. You should definitely consider to be more lazy and generate numbers on demand without storing then in Lists

  4. I suspect that in most general use cases the rounding overhead would be negligible.

    If you suspect something - test it :-) My answer is "Yes", but again... don't believe me. Test it.

So, back to the main question: Is there an better way?
Yes, of course!
But it depends.

  1. Choose BigDecimal if you need very big numbers and very small numbers. But if you cast them back to Double, and more than that, use it with numbers of "close" magnitude - no need for them! Checkout the same repl: https://repl.it/repls/RespectfulSufficientWorker - last test shows that there will be no difference in results, but a dig loss in speed.
  2. Make some micro-optimizations based on your data properties, your task, and your environment.
  3. Prefer short and simple code if there is not to much to gain from performance boost of 5-10%. Don't waist your time
  4. Maybe use fixed-point arithmetic if you can and if it's worth it.

Other than that, you are fine.

PS. There's also a Kahan Summation Formula implementation in the repl... just for fun. https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html#1346 and it works - you can mitigate summation errors

like image 28
x00 Avatar answered Oct 14 '22 12:10

x00


Try this.

public static List<Double> generateSequenceRounded(double start, double end, double step) {
    long mult = (long) Math.pow(10, BigDecimal.valueOf(step).scale());
    return DoubleStream.iterate(start, d -> (double) Math.round(mult * (d + step)) / mult)
                .limit((long) (1 + (end - start) / step)).boxed().collect(Collectors.toList());
}

Here,

int java.math.BigDecimal.scale()

Returns the scale of this BigDecimal. If zero or positive, the scale is the number of digits to the right ofthe decimal point. If negative, the unscaled value of the number is multiplied by ten to the power of the negation of the scale. For example, a scale of -3 means the unscaled value is multiplied by 1000.

In main()

System.out.println(generateSequenceRounded(0.0, 102.0, 10.2));
System.out.println(generateSequenceRounded(0.0, 102.0, 10.24367));

And Output:

[0.0, 10.2, 20.4, 30.6, 40.8, 51.0, 61.2, 71.4, 81.6, 91.8, 102.0]
[0.0, 10.24367, 20.48734, 30.73101, 40.97468, 51.21835, 61.46202, 71.70569, 81.94936, 92.19303]
like image 20
Maddy Avatar answered Oct 14 '22 12:10

Maddy