Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Most performant way to subtract one array from another

I have the following code which is the bottleneck in one part of my application. All I do is subtract on Array from another. Both of these arrays have more around 100000 elements. I'm trying to find a way to make this more performant.

var
  Array1, Array2 : array of integer;

..... 
// Code that fills the arrays
.....

for ix := 0 to length(array1)-1
  Array1[ix] := Array1[ix] - Array2[ix];

end;

Does anybody have a suggestion?

like image 392
Michael Küller Avatar asked Feb 15 '11 19:02

Michael Küller


People also ask

How do you subtract one array from another?

C = A – B subtracts array B from array A by subtracting corresponding elements. The sizes of A and B must be the same or be compatible. If the sizes of A and B are compatible, then the two arrays implicitly expand to match each other.

How do you subtract one array from another array in Python?

subtract() function is used when we want to compute the difference of two array.It returns the difference of arr1 and arr2, element-wise. Parameters : arr1 : [array_like or scalar]1st Input array. arr2 : [array_like or scalar]2nd Input array.

How do I subtract one NumPy array from another?

The most straightforward way to subtract two matrices in NumPy is by using the - operator, which is the simplification of the np. subtract() method - NumPy specific method designed for subtracting arrays and other array-like objects such as matrices.

Which subtracts one list from another and returns the result Javascript?

Implement a difference function, which subtracts one list from another and returns the result. It should remove all values from list a , which are present in list b keeping their order. Before reading the solution below, try to solve Array.


2 Answers

Running this on multiple threads, with that big an array will net linear speed-up. It's embarrassingly parallel as they say.

like image 129
David Heffernan Avatar answered Oct 16 '22 17:10

David Heffernan


Running subtraction on more threads sounds good, but 100K integer sunstraction don't take a lot of CPU time, so maybe threadpool... However settings threads have also a lot of overhead, so short arrays will have slower productivity in parallel threads than in only one (main) thread!

Did you switch off in compiler settings, overflow and range checking?

You can try to use asm rutine, it is very simple...

Something like:

procedure SubArray(var ar1, ar2; length: integer);
asm
//length must be > than 0!
   push ebx
   lea  ar1, ar1 -4
   lea  ar2, ar2 -4
@Loop:
   mov  ebx, [ar2 + length *4]
   sub  [ar1 + length *4], ebx
   dec  length
//Here you can put more folloving parts of rutine to more unrole it to speed up.
   jz   @exit
   mov  ebx, [ar2 + length *4]
   sub  [ar1 + length *4], ebx
   dec  length
//
   jnz  @Loop
@exit:
   pop  ebx
end;


begin
   SubArray(Array1[0], Array2[0], length(Array1));

It can be much faster...

EDIT: Added procedure with SIMD instructions. This procedure request SSE CPU support. It can take 4 integers in XMM register and subtract at once. There is also possibility to use movdqa instead movdqu it is faster, but you must first to ensure 16 byte aligment. You can also unrole the XMM par like in my first asm case. (I'm interesting about speed measurment. :) )

var
  array1, array2: array of integer;

procedure SubQIntArray(var ar1, ar2; length: integer);
asm
//prepare length if not rounded to 4
  push     ecx
  shr      length, 2
  jz       @LengthToSmall
@Loop:
  movdqu   xmm1, [ar1]          //or movdqa but ensure 16b aligment first
  movdqu   xmm2, [ar2]          //or movdqa but ensure 16b aligment first
  psubd    xmm1, xmm2
  movdqu   [ar1], xmm1          //or movdqa but ensure 16b aligment first
  add      ar1, 16
  add      ar2, 16
  dec      length
  jnz      @Loop
@LengthToSmall:
  pop      ecx
  push     ebx
  and      ecx, 3
  jz       @Exit
  mov      ebx, [ar2]
  sub      [ar1], ebx
  dec      ecx
  jz       @Exit
  mov      ebx, [ar2 + 4]
  sub      [ar1 + 4], ebx
  dec      ecx
  jz       @Exit
  mov      ebx, [ar2 + 8]
  sub      [ar1 + 8], ebx
@Exit:
  pop      ebx
end;

begin
//Fill arrays first!
  SubQIntArray(Array1[0], Array2[0], length(Array1));
like image 9
GJ. Avatar answered Oct 16 '22 18:10

GJ.