removeIf implementation detail

Tags: java, iterator, java-8

There is a small implementation detail in ArrayList::removeIf that I fail to understand. I don't think I can state the question plainly without some background first.

To start: the implementation is basically a bulk remove, unlike ArrayList::remove. An example should make things a lot easier to understand. Let's say I have this list:

List<Integer> list = new ArrayList<>(); // 2, 4, 6, 5, 5
list.add(2);
list.add(4);
list.add(6);
list.add(5);
list.add(5);

And I would like to remove every element that is even. I could do:

Iterator<Integer> iter = list.iterator();
while (iter.hasNext()) {
    int elem = iter.next();
    if (elem % 2 == 0) {
         iter.remove();
    }
}

Or:

list.removeIf(x -> x % 2 == 0);

The result will be the same, but the implementation is very different. Since the iterator is a view of the ArrayList, every time I call remove, the underlying ArrayList has to be brought to a "good" state, meaning that the inner array will actually change. Again, every single call of remove results in an internal call to System::arraycopy.

In contrast, removeIf is smarter. Since it does the iteration internally, it can optimize things. The way it does this is interesting.
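To make the difference concrete, here is a rough sketch that counts element moves in a simplified model of both strategies. It works on a plain int array and is not the JDK code. The class and method names are made up for this illustration, and the counts are no benchmark.

```java
class RemovalCostSketch {
    // One-at-a-time removal: every removed element shifts the whole tail left by one slot.
    static int movesOneAtATime(int[] values) {
        int[] es = values.clone();
        int size = es.length;
        int moves = 0;
        int i = 0;
        while (i < size) {
            if (es[i] % 2 == 0) {
                int tail = size - i - 1;
                System.arraycopy(es, i + 1, es, i, tail);
                moves += tail;
                size--;
            } else {
                i++;
            }
        }
        return moves;
    }

    // Bulk removal: a single pass writes each surviving element to its final slot.
    static int movesBulk(int[] values) {
        int[] es = values.clone();
        int w = 0;
        int moves = 0;
        for (int i = 0; i < es.length; i++) {
            if (es[i] % 2 != 0) {
                if (w != i) moves++; // only count elements that actually change position
                es[w++] = es[i];
            }
        }
        return moves;
    }

    public static void main(String[] args) {
        int[] input = {2, 4, 6, 5, 5};
        System.out.println(movesOneAtATime(input)); // 9
        System.out.println(movesBulk(input));       // 2
    }
}
```

For the list 2, 4, 6, 5, 5 the one-at-a-time model moves 4 + 3 + 2 = 9 elements, while the single pass moves only the two 5s.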

removeIf first computes the indexes of the elements that are to be removed. It does so by building a tiny bit set: an array of long values, where each array slot holds a 64-bit value (a long), and several such values together form the bit set. To set the bit at a particular offset, you first find the index of the long in the array and then set the corresponding bit within it. This is not very complicated. Let's say you want to set bits 65 and 3. First we need a long[] l = new long[2] (because we went beyond 64 bits, but not beyond 128):

|0...(60 more bits here)...000|0...(60 more bits here)...000|

You first find the index: 65 / 64 (they actually do 65 >> 6), and then at that index (1) set the needed bit:

1L << 65 // for a long only the low 6 bits of the shift distance are used, so this "jumps" the first 64 bits and equals 1L << 1, i.e. 00000...10

The same goes for 3. The long array then becomes:

|0...(60 more bits here)...010|0...(59 more bits here)...1000|

In the source code, this bit set is called deathRow (nice name!).
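The bit arithmetic described above can be sketched in a few lines. This is only an illustration of the technique on a plain long[]. The class and helper names are chosen for this sketch and are not claimed to match the JDK source.

```java
class DeathRowBitsSketch {
    // Allocates enough 64-bit words to hold n bits.
    static long[] nBits(int n) {
        return new long[((n - 1) >> 6) + 1];
    }

    // Sets bit i: i >> 6 selects the word, 1L << i selects the bit inside it
    // (for a long, Java uses only the low 6 bits of the shift distance).
    static void setBit(long[] bits, int i) {
        bits[i >> 6] |= 1L << i;
    }

    static boolean isSet(long[] bits, int i) {
        return (bits[i >> 6] & (1L << i)) != 0;
    }

    public static void main(String[] args) {
        long[] l = nBits(66);          // two words, enough for bits 0..127
        setBit(l, 65);
        setBit(l, 3);
        System.out.println(Long.toBinaryString(l[1])); // 10
        System.out.println(Long.toBinaryString(l[0])); // 1000
    }
}
```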

Let's take the even-number example from above, where list = [2, 4, 6, 5, 5].

They iterate over the array and compute deathRow, setting a bit for every index where Predicate::test returns true.

deathRow = 7 (000... 111)

meaning the elements at indexes [0, 1, 2] are to be removed

They now rearrange the elements in the underlying array based on that deathRow (I am not going into the details of how this is done).

The inner array becomes [5, 5, 6, 5, 5]. Basically, they move the elements that are supposed to remain to the front of the array.
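A minimal sketch of these two phases, assuming a plain array instead of ArrayList's internal state, could look like this. It is a simplified imitation for illustration, not the JDK implementation, and the names BulkRemoveSketch and compact are hypothetical.

```java
import java.util.Arrays;
import java.util.function.Predicate;

class BulkRemoveSketch {
    // Phase 1 marks the elements to remove in a bit set, phase 2 moves the
    // survivors to the front. Returns the number of survivors (w).
    static <T> int compact(T[] es, int size, Predicate<? super T> filter) {
        long[] deathRow = new long[((size - 1) >> 6) + 1];
        for (int i = 0; i < size; i++)
            if (filter.test(es[i]))
                deathRow[i >> 6] |= 1L << i;          // mark index i for removal
        int w = 0;
        for (int i = 0; i < size; i++)
            if ((deathRow[i >> 6] & (1L << i)) == 0)  // bit clear: element stays
                es[w++] = es[i];
        return w;
    }

    public static void main(String[] args) {
        Integer[] es = {2, 4, 6, 5, 5};
        int w = compact(es, es.length, x -> x % 2 == 0);
        System.out.println(w + " " + Arrays.toString(es)); // 2 [5, 5, 6, 5, 5]
    }
}
```

Note that the slots from w onwards still hold stale values after this step, which is why a clean-up step has to follow.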

I can finally bring in the question.

At this point in time, they know:

 w   ->  number of elements that have to remain in the list (2)
 es  ->  the array itself ([5, 5, 6, 5, 5])
 end ->  equal to size, never changed

To me, there is a single step left to do here:

void getRidOfElementsFromWToEnd() {
    for (int i = w; i < end; ++i) {
        es[i] = null;
    }
    size = w;
}

Instead, this happens:

private void shiftTailOverGap(Object[] es, int w, int end) {
    System.arraycopy(es, end, es, w, size - end);
    for (int to = size, i = (size -= end - w); i < to; i++)
        es[i] = null;
}

I've renamed the variables on purpose here.

What is the point in calling:

 System.arraycopy(es, end, es, w, size - end);

Especially size - end: since end is equal to size all the time and never changes, this is always zero, so the call is basically a no-op here. What corner case am I missing?

578

asked Feb 03 '20 22:02

Eugene

1 Answer

You are looking at the specific (common) case in which the list you call removeIf on is the ArrayList itself. Only in this case can you assume that end is always equal to size.

A counter-example would be:

ArrayList<Integer> l = new ArrayList<>(List.of(1, 2, 3, 4, 5, 6, 7));
l.subList(2, 5).removeIf(i -> i%2 == 1);

Likewise, removeAll will call shiftTailOverGap with an end argument that can differ from size when it is applied to a subList.
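To see why the arraycopy matters in the sub-list case, here is a stand-alone imitation of the method that works on a plain array and takes size as a parameter instead of using the ArrayList field. The input state in main is my own trace of the counter-example above (an assumption, not output taken from the JDK): after compacting the range [2, 5), the survivor 4 sits at index 2, so w = 3, end = 5 and size = 7.

```java
import java.util.Arrays;

class ShiftTailSketch {
    // Closes the gap [w, end) by moving the tail [end, size) down to w,
    // clears the unused slots, and returns the new size.
    static int shiftTailOverGap(Object[] es, int size, int w, int end) {
        System.arraycopy(es, end, es, w, size - end); // copies nothing only when end == size
        int newSize = size - (end - w);
        for (int i = newSize; i < size; i++)
            es[i] = null;                             // drop stale references
        return newSize;
    }

    public static void main(String[] args) {
        Object[] es = {1, 2, 4, 4, 5, 6, 7};
        int size = shiftTailOverGap(es, 7, 3, 5);
        System.out.println(size + " " + Arrays.toString(es));
        // 5 [1, 2, 4, 6, 7, null, null]
    }
}
```

Here size - end is 2, so the elements 6 and 7 are moved over the gap. With end == size the copy length is zero and only the nulling loop has any effect.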

A similar situation arises when you call clear(). In that case, the actual operation performed when calling it on the ArrayList itself is so trivial that it does not even call the shiftTailOverGap method. Only when using something like l.subList(a, b).clear() will it end up at removeRange(a, b) on l, which will in turn, as you already found out yourself, invoke shiftTailOverGap(elementData, a, b) with a b which can be smaller than size.
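The observable effect of both sub-list cases can be checked with public API only. The following small demo uses made-up class and method names and Arrays.asList for the input. It shows what ends up in the backing list, not which internal methods get called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SubListRemovalDemo {
    // Removes the odd elements, but only within the index range [a, b).
    static List<Integer> removeOddInRange(List<Integer> source, int a, int b) {
        ArrayList<Integer> l = new ArrayList<>(source);
        l.subList(a, b).removeIf(i -> i % 2 == 1);
        return l;
    }

    // Removes all elements within the index range [a, b).
    static List<Integer> clearRange(List<Integer> source, int a, int b) {
        ArrayList<Integer> l = new ArrayList<>(source);
        l.subList(a, b).clear();
        return l;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        System.out.println(removeOddInRange(input, 2, 5)); // [1, 2, 4, 6, 7]
        System.out.println(clearRange(input, 2, 5));       // [1, 2, 6, 7]
    }
}
```

In both results the elements 6 and 7 stay in the list after the modified range, which is the tail that has to be shifted over the gap.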

154

answered Oct 01 '22 18:10

Holger
