I am trying to do the following:
JavaPairRDD<JsonObject, JsonObject> rdd1 = ..
JavaPairRDD<JsonObject, String> rdd2 = ..
JavaPairRDD<JsonObject, Tuple2<Iterable<JsonObject>, Iterable<String>>>
    groupedRDD = rdd1.groupWith(rdd2);
But I'm not sure how Spark will compare two JsonObject keys.
More generally, how are keys compared when doing a join or groupWith?
Spark compares keys with the Java equals() method, and it uses hashCode() to decide which partition each key is sent to.
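The problem is that the JsonObject class here does not implement equals(), so the default implementation from java.lang.Object is used, which compares only object references. Two distinct JsonObject instances with identical content are therefore treated as different keys. The Javadoc for Object.equals says: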
The equals method for class Object implements the most discriminating possible equivalence relation on objects; that is, for any non-null reference values x and y, this method returns true if and only if x and y refer to the same object (x == y has the value true).
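One possible workaround is to key the RDDs on a small wrapper class that defines value-based equals() and hashCode(). The sketch below is only an illustration: JsonKey, RefKey and KeyEqualityDemo are hypothetical names, and it assumes each key can be reduced to a stable canonical string (for example, serialized JSON with a consistent field order). It uses a plain HashMap rather than Spark to show the difference between value and reference equality.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key wrapper: stores a canonical string form of the JSON key,
// so that equality is decided by value instead of by object reference.
final class JsonKey implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String canonical;

    JsonKey(String canonical) {
        this.canonical = Objects.requireNonNull(canonical);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof JsonKey)) return false;
        return canonical.equals(((JsonKey) other).canonical);
    }

    // Must be consistent with equals(): equal keys need equal hash codes.
    @Override
    public int hashCode() {
        return canonical.hashCode();
    }

    @Override
    public String toString() {
        return canonical;
    }
}

// For contrast: a key class that inherits equals()/hashCode() from Object.
final class RefKey {
    final String text;

    RefKey(String text) {
        this.text = text;
    }
}

class KeyEqualityDemo {
    public static void main(String[] args) {
        Map<JsonKey, Integer> byValue = new HashMap<>();
        byValue.merge(new JsonKey("{\"id\":1}"), 1, Integer::sum);
        byValue.merge(new JsonKey("{\"id\":1}"), 1, Integer::sum);

        Map<RefKey, Integer> byRef = new HashMap<>();
        byRef.merge(new RefKey("{\"id\":1}"), 1, Integer::sum);
        byRef.merge(new RefKey("{\"id\":1}"), 1, Integer::sum);

        System.out.println(byValue.size()); // 1: both keys are equal by value
        System.out.println(byRef.size());   // 2: the keys are distinct references
    }
}
```

With keys like JsonKey, records from both RDDs that carry the same key content should end up in the same group. Note that comparing JSON as text is sensitive to field order and whitespace, so the canonical form has to be produced the same way on both sides.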