From what i understand, Set in java is an unordered collection and an iterator will process the items in some certain order of its choice(I might be wrong here) but ensures it process all the elements in the set.
In Java8, stream() API in Collections has been introduced with skip and limit functionality. So i want to know if the order of the items processed from stream remain same irrespective of how many times i start a stream or will it be random everytime ? Will the order change if the set gets modified in between the streams ?
May be irrelevant but i am providing the problem here : Now coming to the problem, I have a Set of size 2000 or something which wont be modified post creation and i am doing a batch operation for 50 involving a network call for each batch. I have a start param which gets incremented by 50 post every batch call. If i use a stream over my Set with "start" as the skip argument for every batch, it would be a new stream for every batch right ? So is the order of the stream gaurenteed to remain same. Obviously i dont wont the same entry multiple times and more importantly i dont wont to miss out on any entry. Simplest thing is for me to an Arraylist but i want to know if it is really required for me to create a set.
Let's start with an example here. First the obvious one I think:
List<String> wordList = Arrays.asList("just", "a", "test");
Set<String> wordSet = new HashSet<>(wordList);
System.out.println(wordSet);
for (int i = 0; i < 100; i++) {
wordSet.add("" + i);
}
for (int i = 0; i < 100; i++) {
wordSet.remove("" + i);
}
System.out.println(wordSet);
Output will show a different "order" - because we have made the capacity bigger (via 1-100
addition) and entries have moved. They are still 3 there - but in different order (if such can be called order).
So, yes, once you modify your Set
between stream operations, the "order" could change.
Since you say that post creation the Set
will not be modified - the order is preserved at the moment, under the current implementation (whatever that is). Or more accurately it is not internally randomized - once entries are laid into the Set
.
But this is absolutely something not to rely one - ever. Things can change without notice, since the contract is allowed to do that - the docs don't make any guarantees about any order what-so-ever - Set
s are about uniqueness after all.
To give you an example the jdk-9 Immutable Set
and Map
do have an internal randomization and the "order" will change from run to run:
Set<String> set = Set.of("just", "a", "test");
System.out.println(set);
This is allowed to print:
[a, test, just] or [a, just, test]
EDIT
Here is how the randomization pattern looks like:
/**
* A "salt" value used for randomizing iteration order. This is initialized once
* and stays constant for the lifetime of the JVM. It need not be truly random, but
* it needs to vary sufficiently from one run to the next so that iteration order
* will vary between JVM runs.
*/
static final int SALT;
static {
long nt = System.nanoTime();
SALT = (int)((nt >>> 32) ^ nt);
}
What this does:
take a long, XOR the first 32 bits with the last 32 bits and take the last 32 bits from that long (by casting to int). XOR is used because it has a 50% zeroes and ones distribution, so it does not alter the result.
How is that used(for a Set
of two elements for example):
// based on SALT set the elements in a particular iteration "order"
if (SALT >= 0) {
this.e0 = e0;
this.e1 = e1;
} else {
this.e0 = e1;
this.e1 = e0;
My guess on the jdk9 internal randomization part, initially taken from here, the relevant part:
The final safety feature is the randomized iteration order of the immutable Set elements and Map keys. HashSet and HashMap iteration order has always been unspecified, but fairly stable, leading to code having inadvertent dependencies on that order. This causes things to break when the iteration order changes, which occasionally happens. The new Set/Map collections change their iteration order from run to run, hopefully flushing out order dependencies earlier in test or development
So it's basically to break all that code that would rely on order for a Set
/Map
. The same thing happened when people moved from java-7 to java-8 and were relying on HashMap's order (LinkedNode
s), that was different due to TreeNode
s introduction. If you leave a feature like that and people rely on it for years - it's hard to remove it and perform some optimizations - like HashMap moved to TreeNode
s; because now you are forced to preserve that order, even if you don't want to. But that is just a guess obviously, treat it as such please
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With