Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance: Loop through ArrayList hundreds of times vs converting Arraylist to HashMap and Back?

I have two large (1000+ object) ArrayLists that I need to compare and manipulate. I essentially need to take a value from ArrayList A, look for a matching object in ArrayList B, then manipulate the object from B. I need to do this in all objects for A. I need to do this frequently in the application. Order is not known and sizes will differ.

(pseudocode)
ArrayList<myObject> A
ArrayList<myObject> B

I could loop through every single item in B looking for the one that matches the entity from A, for each entity in A. That just seems so inefficient.

(pseudocode)
for (each object in A){loop through all of B and find it}

Would it be worth converting B to a HashMap (using the specific value I am comparing as the key, and the object as the value), then searching B that way, then converting that temporary HashMap back to an ArrayList when I'm done processing?

(pseudocode)
convert B to HashMap<myObject.myValue,myObject> C
for (each object in A){look up the value in C}
convert C back to an ArrayList

Is this a good idea? Or is this premature/unnecessary optimization? Thank you.

(Background: Data comes to me from the service as an ArrayList - and the frontend needs an ArrayList for the view layer. I'm trying to make this middle tier processing more efficient - but the entry and exit objects must be ArrayList (or some other list) )

like image 487
K13 Avatar asked May 28 '19 23:05

K13


People also ask

Is HashMap faster or ArrayList?

The ArrayList has O(n) performance for every search, so for n searches its performance is O(n^2). The HashMap has O(1) performance for every search (on average), so for n searches its performance will be O(n). While the HashMap will be slower at first and take more memory, it will be faster for large values of n.

Which is best ArrayList or HashMap?

ArrayList allows duplicate elements. HashMap allows duplicate values but does not allow duplicate keys. The ArrayList always gives O(1) performance in best case or worst-case time complexity. The HashMap get() method has O(1) time complexity in the best case and O(n) time complexity in worst case.


1 Answers

Yes, for large numbers, a HashMap is beneficial.

Your initial algorithm will take a long time, looping through both lists in nested for loops. This is an O(n2) algorithm. Even assuming 1000 items each in A and B, and assuming a cost of 1 for comparing two individual items, one from A and one from B, you're looking at 500k comparisons (avoiding comparing each item twice). Doing this frequently will cause slow performance.

Assuming you have a good hash code algorithm for your objects, looking up a value from a HashMap is O(1) access. You'll still spend O(n) time constructing it, but that's nothing compared to O(n2) if you have a large number of items.

Construct your HashMap "C" once using data from "B" and use it many times, as long as B's information hasn't changed. If you "need to do this frequently", then the performance will be even better because the HashMap is already built -- just reuse it.

If you need to maintain the order, store the B list index as the value in the hash map.

You don't need to be "converting that temporary hashmap back to an arraylist", because creating the HashMap "C" doesn't destroy the original list "B". One thing to be careful of is if the objects in the list B change, forcing updates to the HashMap to keep consistent. Another thing to watch is your memory usage for very large lists -- can you keep the objects, the list, and the hashmap in memory?

Your pseudocode:

for each index in B:
    get object b
    put in hash map C values (b, index)

for each a in A:
    if found in hash map C: do something with found object

For smaller numbers, the O(n2) performance times will be small enough that building the HashMap won't be worth it. That is a decision you'll need to make -- you would need to decide when the lists are large enough that building a HashMap and using it is worth the HashMap construction cost.

like image 112
rgettman Avatar answered Nov 15 '22 18:11

rgettman