
API design for functions acting on arrays

I'm designing an API in Java for a set of numerical algorithms that act on arrays of doubles (for real-time financial statistics as it happens). For performance reasons the API has to work with primitive arrays, so List<Double> and suchlike are not an option.

A typical use case might be an algorithm object that takes two input arrays and returns an output array containing a result computed from the two inputs.

I'd like to establish consistent conventions for how the array parameters are used in the API, in particular:

  • Should I include an offset with every function so users can act on part of a larger array, e.g. someFunction(double[] input, int inputOffset, int length)?
  • If a function needs both input and output parameters, should the input or the output come first in the parameter list?
  • Should the caller allocate an output array and pass it as a parameter (which could potentially be re-used), or should the function create and return an output array each time it is called?

The objectives are to achieve a balance of efficiency, simplicity for API users, and consistency both within the API and with established conventions.

Clearly there are a lot of options, so what is the best overall API design?

mikera asked Jan 13 '12 02:01


2 Answers

That really sounds like three questions, so here are my opinions on each.

Of course, this is very subjective, so your mileage may vary:

  1. Yes. Always include offset & length. If most use cases for a particular function don't need those parameters, overload the function so that offset & length are not required.

  2. For this, I would follow the convention used by System.arraycopy, where the source comes before the destination:

    arraycopy(Object src, int srcPos, Object dest, int destPos, int length)

  3. The performance difference here is going to be negligible unless the caller calls your utility functions repeatedly. If they're just one-off calls, there should be no difference. If they are called repeatedly, then you should have the caller pass in a pre-allocated output array that it can reuse.
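As a rough sketch of how the three points above could fit together: the class name VectorOps and the method name add are made up for illustration and do not come from the question. The full form takes offsets and a length, orders its parameters the way System.arraycopy does (sources, then destination, then length), and writes into a caller-supplied array. The overload covers the common whole-array case.

```java
import java.util.Arrays;

/** Hypothetical example class; all names here are illustrative assumptions. */
final class VectorOps {
    private VectorOps() {}

    /**
     * Full form: dest[destPos + i] = a[aPos + i] + b[bPos + i] for i in [0, length).
     * Parameter order mirrors System.arraycopy: sources, destination, then length.
     */
    static void add(double[] a, int aPos, double[] b, int bPos,
                    double[] dest, int destPos, int length) {
        // Validate every range up front so a bad call fails before any write.
        if (length < 0 || aPos < 0 || bPos < 0 || destPos < 0
                || aPos > a.length - length
                || bPos > b.length - length
                || destPos > dest.length - length) {
            throw new IndexOutOfBoundsException("range out of bounds");
        }
        for (int i = 0; i < length; i++) {
            dest[destPos + i] = a[aPos + i] + b[bPos + i];
        }
    }

    /** Convenience overload: whole arrays, no offset or length needed. */
    static void add(double[] a, double[] b, double[] dest) {
        if (a.length != b.length || dest.length < a.length) {
            throw new IllegalArgumentException("array lengths do not match");
        }
        add(a, 0, b, 0, dest, 0, a.length);
    }

    public static void main(String[] args) {
        double[] prices = {1.0, 2.0, 3.0, 4.0};
        double[] deltas = {0.5, 0.5, 0.5, 0.5};
        double[] out = new double[4]; // allocated once by the caller, reusable

        add(prices, deltas, out);             // whole arrays
        System.out.println(Arrays.toString(out));

        add(prices, 1, deltas, 0, out, 0, 2); // two elements, starting at prices[1]
        System.out.println(Arrays.toString(out));
    }
}
```

A caller in a tight loop can keep reusing the same out array, which is the case where passing the output in is most likely to pay off.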

debracey answered Sep 23 '22 08:09


Assuming that you are working with arrays small enough to be allocated on the stack or in Eden, allocation is extremely fast. Therefore, there is no harm in having functions allocate their own arrays to return results. Doing this is a big win for readability.

I would suggest starting out by making your functions operate on whole arrays, and adding the option to call a function on just a slice of an array only if you find that it is actually useful.
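A minimal sketch of that simpler style, again with made-up names (SimpleOps, add): the function works on whole arrays and returns a freshly allocated result. Until slice overloads prove necessary, a caller who needs a sub-range can copy it out first with Arrays.copyOfRange, at the cost of an extra allocation.

```java
import java.util.Arrays;

/** Hypothetical example class; names are illustrative assumptions. */
final class SimpleOps {
    private SimpleOps() {}

    /** Returns a new array holding the element-wise sum of a and b. */
    static double[] add(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("array lengths do not match");
        }
        double[] result = new double[a.length]; // fresh allocation on every call
        for (int i = 0; i < a.length; i++) {
            result[i] = a[i] + b[i];
        }
        return result;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        double[] y = {10.0, 20.0, 30.0};

        // Whole-array call: no offsets, no output parameter.
        System.out.println(Arrays.toString(add(x, y)));

        // Slice by copying at the call site (indices 1 and 2 of each input).
        double[] tail = add(Arrays.copyOfRange(x, 1, 3), Arrays.copyOfRange(y, 1, 3));
        System.out.println(Arrays.toString(tail));
    }
}
```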

Russell Zahniser answered Sep 26 '22 08:09