Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a Java data structure that is effectively an ArrayList with double indicies and built-in interpolation?

I am looking for a pre-built Java data structure with the following characteristics:

  1. It should look something like an ArrayList but should allow indexing via double-precision rather than integers. Note that this means that it's likely that you'll see indicies that don't line up with the original data points (i.e., asking for the value that corresponds to key "1.5"). EDIT: For clarity, based on the comments, I'm not looking to change the ArrayList implementation. I'm looking for a similar interface and developer experience.

  2. As a consequence, the value returned will likely be interpolated. For example, if the key is 1.5, the value returned could be the average of the value at key 1.0 and the value at key 2.0.

  3. The keys will be sorted but the values are not ensured to be monotonically increasing. In fact, there's no assurance that the first derivative of the values will be continuous (making it a poor fit for certain types of splines).

  4. Freely available code only, please.

For clarity, I know how to write such a thing. In fact, we already have an implementation of this and some related data structures in legacy code that I want to replace due to some performance and coding issues.

What I'm trying to avoid is spending a lot of time rolling my own solution when there might already be such a thing in the JDK, Apache Commons or another standard library. Frankly, that's exactly the approach that got this legacy code into the situation that it's in right now....

Is there such a thing out there in a freely available library?

like image 836
Bob Cross Avatar asked Apr 20 '10 14:04

Bob Cross


People also ask

Which data structure is used in ArrayList in Java?

Internally an ArrayList uses an Object[] Array which is an array of objects. All operation like deleting, adding, and updating the elements happens in this Object[] array.

What type of data structure is an ArrayList?

In Java, the ArrayList is a resizable array data structure that implements the List interface. Resizable arrays, also called dynamic arrays, are data structures that store elements in sequential order and whose size can be increased or decreased by adding or removing elements.

Is ArrayList a dynamic data structure?

ArrayLists. An ArrayList is a built-in data structure that uses a dynamic array to store its elements.

How does ArrayList grow dynamically?

ArrayList is a class in java. util package which implements dynamic-sized arrays. ArrayList dynamically grows and shrinks in size on the addition and removal of elements respectively. ArrayList inherits the AbstractList class and implements the List, RandomAccess and java.


2 Answers

Allowing double values as indices is a pretty large change from what ArrayList does.

The reason for this is that an array or list with double as indices would almost by definition be a sparse array, which means it has no value (or depending on your definition: a fixed, known value) for almost all possible indices and only a finite number of indices have an explicit value set.

There is no prebuilt class in Java SE that supports all that.

Personally I'd implement such a data structure as a skip-list (or similar fast-searching data structure) of (index, value) tuples with appropriate interpolation.

Edit: Actually there's a pretty good match for the back-end storage (i.e. everything except for the interpolation): Simply use a NavigableMap such as a TreeMap to store the mapping from index to value.

With that you can easily use ceilingEntry() and (if necessary) higherEntry() to get the closest value(s) to the index you need and then interpolate from those.

like image 69
Joachim Sauer Avatar answered Oct 10 '22 01:10

Joachim Sauer


If your current implementation has complexity O(log N) for interpolating a value, the implementation I just made up may be for you:

package so2675929;

import java.util.Arrays;

public abstract class AbstractInterpolator {
  private double[] keys;
  private double[] values;
  private int size;

  public AbstractInterpolator(int initialCapacity) {
    keys = new double[initialCapacity];
    values = new double[initialCapacity];
  }

  public final void put(double key, double value) {
    int index = indexOf(key);
    if (index >= 0) {
      values[index] = value;
    } else {
      if (size == keys.length) {
        keys = Arrays.copyOf(keys, size + 32);
        values = Arrays.copyOf(values, size + 32);
      }
      int insertionPoint = insertionPointFromIndex(index);
      System.arraycopy(keys, insertionPoint, keys, insertionPoint + 1, size - insertionPoint);
      System.arraycopy(values, insertionPoint, values, insertionPoint + 1, size - insertionPoint);
      keys[insertionPoint] = key;
      values[insertionPoint] = value;
      size++;
    }
  }

  public final boolean containsKey(double key) {
    int index = indexOf(key);
    return index >= 0;
  }

  protected final int indexOf(double key) {
    return Arrays.binarySearch(keys, 0, size, key);
  }

  public final int size() {
    return size;
  }

  protected void ensureValidIndex(int index) {
    if (!(0 <= index && index < size))
      throw new IndexOutOfBoundsException("index=" + index + ", size=" + size);
  }

  protected final double getKeyAt(int index) {
    ensureValidIndex(index);
    return keys[index];
  }

  protected final double getValueAt(int index) {
    ensureValidIndex(index);
    return values[index];
  }

  public abstract double get(double key);

  protected static int insertionPointFromIndex(int index) {
    return -(1 + index);
  }
}

The concrete interpolators will only have to implement the get(double) function.

For example:

package so2675929;

public class LinearInterpolator extends AbstractInterpolator {

  public LinearInterpolator(int initialCapacity) {
    super(initialCapacity);
  }

  @Override
  public double get(double key) {
    final double minKey = getKeyAt(0);
    final double maxKey = getKeyAt(size() - 1);
    if (!(minKey <= key && key <= maxKey))
      throw new IndexOutOfBoundsException("key=" + key + ", min=" + minKey + ", max=" + maxKey);

    int index = indexOf(key);
    if (index >= 0)
      return getValueAt(index);

    index = insertionPointFromIndex(index);
    double lowerKey = getKeyAt(index - 1);
    double lowerValue = getValueAt(index - 1);
    double higherKey = getKeyAt(index);
    double higherValue = getValueAt(index);

    double rate = (higherValue - lowerValue) / (higherKey - lowerKey);
    return lowerValue + (key - lowerKey) * rate;
  }

}

And, finally, a unit test:

package so2675929;

import static org.junit.Assert.*;

import org.junit.Test;

public class LinearInterpolatorTest {

  @Test
  public void simple() {
    LinearInterpolator interp = new LinearInterpolator(2);
    interp.put(0.0, 0.0);
    interp.put(1.0, 1.0);

    assertEquals(0.0, interp.getValueAt(0), 0.0);
    assertEquals(1.0, interp.getValueAt(1), 0.0);
    assertEquals(0.0, interp.get(0.0), 0.0);
    assertEquals(0.1, interp.get(0.1), 0.0);
    assertEquals(0.5, interp.get(0.5), 0.0);
    assertEquals(0.9, interp.get(0.9), 0.0);
    assertEquals(1.0, interp.get(1.0), 0.0);

    interp.put(0.5, 0.0);

    assertEquals(0.0, interp.getValueAt(0), 0.0);
    assertEquals(0.0, interp.getValueAt(1), 0.0);
    assertEquals(1.0, interp.getValueAt(2), 0.0);
    assertEquals(0.0, interp.get(0.0), 0.0);
    assertEquals(0.0, interp.get(0.1), 0.0);
    assertEquals(0.0, interp.get(0.5), 0.0);
    assertEquals(0.75, interp.get(0.875), 0.0);
    assertEquals(1.0, interp.get(1.0), 0.0);
  }

  @Test
  public void largeKeys() {
    LinearInterpolator interp = new LinearInterpolator(10);
    interp.put(100.0, 30.0);
    interp.put(200.0, 40.0);

    assertEquals(30.0, interp.get(100.0), 0.0);
    assertEquals(35.0, interp.get(150.0), 0.0);
    assertEquals(40.0, interp.get(200.0), 0.0);

    try {
      interp.get(99.0);
      fail();
    } catch (IndexOutOfBoundsException e) {
      assertEquals("key=99.0, min=100.0, max=200.0", e.getMessage());
    }
    try {
      interp.get(201.0);
      fail();
    } catch (IndexOutOfBoundsException e) {
      assertEquals("key=201.0, min=100.0, max=200.0", e.getMessage());
    }
  }

  private static final int N = 10 * 1000 * 1000;

  private double measure(int size) {
    LinearInterpolator interp = new LinearInterpolator(size);
    for (int i = 0; i < size; i++)
      interp.put(i, i);
    double max = interp.size() - 1;
    double sum = 0.0;
    for (int i = 0; i < N; i++)
      sum += interp.get(max * i / N);
    return sum;
  }

  @Test
  public void speed10() {
    assertTrue(measure(10) > 0.0);
  }

  @Test
  public void speed10000() {
    assertTrue(measure(10000) > 0.0);
  }

  @Test
  public void speed1000000() {
    assertTrue(measure(1000000) > 0.0);
  }
}

So the functionality seems to work. I only measured speed in some simple cases, and these suggest that scaling will be better than linear.

Update (2010-10-17T23:45+0200): I made some stupid mistakes in checking the key argument in the LinearInterpolator, and my unit tests didn't catch them. Now I extended the tests and fixed the code accordingly.

like image 27
Roland Illig Avatar answered Oct 10 '22 00:10

Roland Illig