Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ArrayList remove vs removeAll

What is better to use, if i want to remove a collection from an arraylist? I think the removeAll method in ArrayList is written for this task, but in a test I wrote, just iterating through the objects and removing them individual was a few seconds faster.

What are you using for this purpose?

edit:

the code of removeAll i found on grepcode calls batchRemove (c, false):

private boolean More ...batchRemove(Collection c, boolean complement) {

700         final Object[] elementData = this.elementData;
701         int r = 0, w = 0;
702         boolean modified = false;
703         try {
704             for (; r < size; r++)
705                 if (c.contains(elementData[r]) == complement)
706                     elementData[w++] = elementData[r];
707         } finally {
708             // Preserve behavioral compatibility with AbstractCollection,
709             // even if c.contains() throws.
710             if (r != size) {
711                 System.arraycopy(elementData, r,
712                                  elementData, w,
713                                  size - r);
714                 w += size - r;
715             }
716             if (w != size) {
717                 // clear to let GC do its work
718                 for (int i = w; i < size; i++)
719                     elementData[i] = null;
720                 modCount += size - w;
721                 size = w;
722                 modified = true;
723             }
724         }
725         return modified;
726     }

i actually dont understand it..

my test code was this:

public class RemoveVsRemovall {

    public static void main(String[] args){
        ArrayList<String> source = new ArrayList<>();
        ArrayList<String> toRemove = new ArrayList<>();
        for(int i = 0; i < 30000; i++){
            String s = String.valueOf(System.nanoTime());
            source.add(s);
            if(i % 2 == 0) toRemove.add(s);
        }
        long startTime = System.nanoTime();
        removeList1(source, toRemove);
        long endTime = System.nanoTime();
        System.out.println("diff: " + (endTime - startTime) * 1e-9);
    }

    static void removeList1(ArrayList<String> source, ArrayList<String> toRemove){
        source.removeAll(toRemove);
    }

    static void removeList2(ArrayList<String> source, ArrayList<String> toRemove){
        for(String s : toRemove){
            source.remove(s);
        }
    }
}

calling it a few times with different list sizes and switching beteween the two methods.

like image 330
T_01 Avatar asked Mar 01 '15 12:03

T_01


People also ask

What is the difference between ArrayList Clear () and removeAll () methods?

clear() deletes every element from the collection and removeAll() one only removes the elements matching those from another Collection.

Which method is used to remove all the elements from any list remove () removeAll () Clear () None?

The Java ArrayList removeAll() method removes all the elements from the arraylist that are also present in the specified collection. The syntax of the removeAll() method is: arraylist. removeAll(Collection c);

What is removeAll in Java?

The removeAll() method of java. util. ArrayList class is used to remove from this list all of its elements that are contained in the specified collection. Syntax: public boolean removeAll(Collection c) Parameters: This method takes collection c as a parameter containing elements to be removed from this list.

What happens to an ArrayList when you remove?

If you call ArrayList. remove() the element at the given index is removed from the list (and all elements following it move down one index). The object itself still exists, no memory has been freed yet.


1 Answers

There are several reason it's hard to give a general answer to that question.

First, you have to understand that these performance characteristics are implementation-dependent. It's quite possible that the implementation varies depending on the platform and version of the JDK.

Having said that, there are mostly 2 strategies for implementing removeAll:

  1. For each element of your ArrayList, check if it is in the other Collection; if so, remove it.
  2. For each element of the Collection, check if it is in the ArrayList; if so, remove it.

If the Collection performs contains in constant-time, strategy 1 (asymptotically) wins. On the other hand, if contains is performed by scanning the whole connection and Collection iterates very slowly, strategy 2 generally has an edge, because it iterates on Collection only once; but even in that case, if the Collection is very big and most of the elements of the ArrayList are among the first elements of the Collection, strategy 1 wins again... there's no end to it.

You're probably better off trusting the implementation of removeAll(); if that fails, try changing data structures; and if that fails too, implement your own method from empirical benchmarks.

like image 164
Valentin Waeselynck Avatar answered Oct 04 '22 01:10

Valentin Waeselynck