I am trying to use Java 8 to rewrite the implementation of Moore’s Voting Algorithm to find the Majority Element in an array.
The Java 7 implementation will be something like this:
public int findCandidate(int[] nums) {
int maj_index = 0, count = 1;
for(int i=1; i<nums.length;i++){
if(count==0){
count++;
maj_index=i;
}else if(nums[maj_index]==nums[i]){
count++;
} else {
count--;
}
}
return nums[maj_index];
}
The method I can think of is using stream reduce to get the final result
public int findCandidate(int[] nums) {
int count = 1;
Arrays
.asList(nums)
.stream()
.reduce(0, (result, cur) -> {
if (count == 0) {
result = cur;
count++;
} else if (result == cur){
count++;
} else {
count --;
}
});
return result;
}
But this method have compile error, besides, it also break the functional purist, I encounter this situation many times, so what is the best way to deal with the global variable inside the lambda expression.
Reducing is the repeated process of combining all elements. reduce operation applies a binary operator to each element in the stream where the first argument to the operator is the return value of the previous application and second argument is the current stream element.
Using global variables causes very tight coupling of code. Using global variables causes namespace pollution. This may lead to unnecessarily reassigning a global value. Testing in programs using global variables can be a huge pain as it is difficult to decouple them when testing.
In Java, reducing is a terminal operation that aggregates a stream into a type or a primitive type. Java 8 provides Stream API contains set of predefined reduction operations such as average(), sum(), min(), max(), and count(). These operations return a value by combining the elements of a stream.
Yes, streams are sometimes slower than loops, but they can also be equally fast; it depends on the circumstances. The point to take home is that sequential streams are no faster than loops.
Yassin Hajaj's answer shows some pretty good streams techniques. (+1) Fundamentally I think it's using the right approach. There are a couple of minor improvements that can be made to it, though.
The first change is use the counting()
collector to count the items in each group instead of accumulating them into a list. Since we're looking for a majority, all we need is the count, not the actual elements, and we avoid having to compare the lengths of the lists.
The second change is to filter the list looking for a group whose count is a majority. By definition there can be at most one, so we just filter the map entries using this predicate and terminate the stream with findAny
instead of max
.
The third change is to have the function return OptionalInt
which more closely matches its intent. The OptionalInt
either contains the majority value, or it's empty if there isn't a majority value. This avoids having to use a sentinel value such as -1
that might actually occur in the data. Since findAny
returns an OptionalInt
, we're done.
Finally, I've relied on static imports in several places. This is mainly a matter of style, but I think it cleans up the code a bit.
Here's my variation:
static OptionalInt majority(int... nums) {
Map<Integer, Long> map =
Arrays.stream(nums)
.boxed()
.collect(groupingBy(x -> x, counting()));
return map.entrySet().stream()
.filter(e -> e.getValue() > nums.length / 2)
.mapToInt(Entry::getKey)
.findAny();
}
So just like I told you within my comment, it is not ok to use mutable objects within your lambda expressions. But in your case, if you really want to apply the same algorithm, it'll be difficult.
Here's one that will do the same as what you want, if no majority is found, it returns -1
public static int findCandidate(int ... nums) {
Map<Integer, List<Integer>> map =
Arrays.stream(nums)
.boxed()
.collect(Collectors.groupingBy(x -> x));
int value =
map
.entrySet().stream()
.max((e1, e2) -> Integer.compare(e1.getValue().size(), e2.getValue().size()))
.map(e -> e.getKey())
.get();
int result = map.get(value).size();
return result > nums.length / 2 ? value : -1;
}
Your problem here is that Java streams don't have a true list fold operation. With a true fold operation it's not too difficult to write the function as a left fold. For example, in Haskell:
import Data.List (foldl')
-- A custom struct to represent the state of the fold.
data MooreState a = MooreState { candidate :: a, count :: !Int }
findCandidate :: Eq a => [a] -> Maybe a
findCandidate (first:rest) = Just result
where
Moore result _ = foldl' combiner (MooreState first 1) rest
combiner :: Eq a => MooreState a -> a -> MooreState a
combiner (Moore candidate count) current
| count == 0 = MooreState current 1
| candidate == current = MooreState candidate (count + 1)
| otherwise = MooreState candidate (count - 1)
-- The empty list has no candidates.
findCandidate [] = Nothing
Java's reduce()
methods are the closest it has to a true left fold, but if you look at the Javadoc for the reduce()
method that you're using, you'll note that it says that:
This documentation is really hard to interpret, but the way I read it is this. Even though it might process elements out of order:
Why is this important? Well, first of all, the fact that you are mutating an external variable means that the way you're using the count
is broken. Element #7 of the stream may be processed before element #5, for all you know.
More insidiously, the combine
operation in the Haskell version above combines inputs of different types (Moore a
and a
), but the Java reduce
method that you use is based on a BinaryOperator<T>
, which combines two objects of the same type. There's another overload of reduce
that uses a BiFunction<U, T, U>
, but that requires you to supply a BinaryOperator<U> combiner
and its U identity
as well. This is because Java's reduce
methods are designed so that they can:
So the associativity and identity requirements are there to guarantee that this parallel processing produces the same result as if you did it sequentially. But this means that while there's a straightforward functional implementation of the algorithm, there is no straightforward way to write it with Java's Stream
type. (There is a non-straightforward way, but that gets into a bit of wizardry that would be (a) really convoluted and (b) really slow in Java.)
So I personally would just accept that Java's not a great functional programming language, leave good enough alone and use the imperative version as-is. But if for some weird reason I really insisted on doing it functionally, I'd go for a library like jOOλ which provides true left folds in Java. Then you can do the same as the Haskell solution (untested pseudocode):
import org.jooq.lambda.Seq;
import org.jooq.lambda.tuple.Tuple2;
class MooreState<A> {
private final A candidate;
private final int count;
// ...constructors and getters...
}
public static Optional<A> findCandidate(Seq<A> elements) {
Tuple2<Optional<A>, Seq<A>> split = elements.splitAtHead();
return split.v1().map(first -> {
Seq<A> rest = split.v2();
return rest.foldLeft(
new MooreState<>(first, 1),
(state, current) -> {
if (state.getCount() == 0) {
return new MooreState<>(current, 1);
} else if (state.getCandidate().equals(current) {
return new MooreState<>(state.getCandidate(),
state.getCount() + 1);
} else {
return new MooreState<>(state.getCandidate(),
state.getCount() - 1);
}
}
);
});
}
...which is probably terrifyingly slow.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With