Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding order of elements in stream generated from HashSet

I read this Java 8 official docs:

Streams may or may not have a defined encounter order. Whether or not a stream has an encounter order depends on the source and the intermediate operations. Certain stream sources (such as List or arrays) are intrinsically ordered, whereas others (such as HashSet) are not.
If a stream is ordered, repeated execution of identical stream pipelines on an identical source will produce an identical result; if it is not ordered, repeated execution might produce different results.

Tried to understand the mentioned behaviour through this code

public class StreamOrderValidator
{
    public static void main( String[] args )
    {
        String[] colors=new String[] {"red","green","blue","orange"};
        List<String> colorsList=Arrays.asList(colors);

        HashSet<String> colorsSet=new HashSet<>();
        colorsSet.addAll(colorsList);
        System.out.println(colorsSet);            // [red, orange, green, blue]

        List<String> processedColorsSet = processStream(colorsSet.stream());
        System.out.println(processedColorsSet);   // [RED, ORANGE, GREEN, BLUE]
    }

    private static List<String> processStream(Stream<String> colorStream) {
        List<String> processedColorsList = colorStream.filter(s->s.length()<=6).
                map(String::toUpperCase).collect(Collectors.toList());
        return processedColorsList;
    }
}

I ran this code many a times and the order of elements in resulting stream was always the same (shown as comment). I am not able to figure it out how this justifies the above quoted text about "Order not being preserved for an unordered collection".

I am definitely misunderstanding the extracted text from javadocs.

like image 511
user2653926 Avatar asked Aug 27 '17 17:08

user2653926


3 Answers

There is a little bit of mis-understating here indeed. A HashSet or any Set is not about order, unless TreeSet that is ordered based on a Comparator.

At the moment, under java-8 once you put elements into a HashSet (and don't change it) - there will be an order of how the elements are laid out; but again, under the condition that you do not add or remove any of them. This can change at any point in time, so don't rely on it.

For example running this:

 String[] colors = new String[] { "red", "green", "blue", "orange" };
 List<String> colorsList = Arrays.asList(colors);

 HashSet<String> colorsSet = new HashSet<>();
 colorsSet.addAll(colorsList);
 System.out.println(colorsSet);

No matter how many times under java-8 at the moment you will always get the same output:

[red, orange, green, blue]

But once you do some internal re-shuffling:

    for (int i = 0; i < 1000; ++i) {
        colorsSet.add("" + i);
    }

    for (int i = 0; i < 1000; ++i) {
        colorsSet.remove("" + i);
    }   


    System.out.println(colorsSet); // [blue, red, green, orange]

You can see that the output changes, because Sets don't have an order. The key point to take is that there is no order, the fact that you do see an order is not a guarantee to happen every time - there might be a build in java-8 that will break this order. And in fact that is easily observable with java-9 for example - where there is a randomization pattern for new Sets.

If you run this multiple times, the result will differ:

 Set<String> set = Set.of("red", "green", "blue", "orange");
 System.out.println(set);

So obviously of you stream from such a Set the order will not be guaranteed and thus you will indeed see different results from run to run.

like image 188
Eugene Avatar answered Oct 17 '22 21:10

Eugene


What you're seeing is basically luck that the HashSet you're streaming returns values in sequential order. If you added enough values over time, you would eventually see different results from the stream due to the HashSet's underlying HashMap having to resize itself and re-order.

What you've supplied (the four colors) would, by chance, return the same results each time since there's no need for the underlying HashMap to resize itself and re-order values around.

Bearing in mind that a HashSet is backed by a HashMap per the java API docs, this question and its accepted answer covers what you're seeing by virtue of explaining a HashMap's behavior:

Order of values retrieved from a HashMap

like image 30
Sam Abazly Avatar answered Oct 17 '22 22:10

Sam Abazly


repeated execution might produce different results.

There's this might word. Even if it does not guarantee the order, it doesn't mean that order will be random every time. Elements are placed based on the hashcode. Try some different values :

    String[] colors=new String[] {"5reegdfg","fsdfsd6546","fsdfxvc4","77ggg"};
    List<String> colorsList=Arrays.asList(colors);

    HashSet<String> intSet =new HashSet<>();
    intSet.addAll(colorsList);


    intSet.forEach(e -> System.out.print(e + " "));

    System.out.println();
    intSet.add("fvcxbxb78ok");


    intSet.forEach(e -> System.out.print( e + " "));

The output is this :

fsdfxvc4 5reegdfg 77ggg fsdfsd6546 
fsdfxvc4 fvcxbxb78ok 5reegdfg 77ggg fsdfsd6546 

As you can see, the order is different in this example.

like image 44
Schidu Luca Avatar answered Oct 17 '22 22:10

Schidu Luca