
Split list of words using streams in Java

I have this method that takes a number of lists, each containing the lines of a book. I combine them into a stream and then iterate over the lines to split each one on every non-letter character (\\P{L}).

Is there a way to avoid the for-each loop and process this within a stream?

private List<String> getWordList(List<String>... lists) {
        List<String> wordList = new ArrayList<>();

        Stream<String> combinedStream = Stream.of(lists)
                .flatMap(Collection::stream);
        List<String> combinedLists = combinedStream.collect(Collectors.toList());

        for (String line: combinedLists) {
            wordList.addAll(Arrays.asList(line.split("\\P{L}")));
        }

        return wordList;
}
Hermann Stahl asked Dec 03 '22 18:12



2 Answers

Since you already have a stream, you can simply flatMap once more and return the result:

return combinedStream
        .flatMap(str -> Arrays.stream(str.split("\\P{L}")))
        .collect(Collectors.toList());

To put it all together:

private List<String> getWordList(List<String>... lists) {
    return Stream.of(lists)
        .flatMap(Collection::stream)
        .flatMap(str -> Arrays.stream(str.split("\\P{L}")))
        .collect(Collectors.toList());
}
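
For a quick runnable check of the combined version, here is a minimal sketch. The class name WordSplitExample, the helper getNonEmptyWordList and the sample lines are made up for illustration, and the methods are static with @SafeVarargs only to keep the example compact and free of varargs warnings. One thing worth knowing: splitting on a single non-letter produces an empty string wherever two non-letters are adjacent (a comma followed by a space, for instance), so the sketch also shows an optional filter. Whether you want that filter is an assumption about your use case.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class WordSplitExample {

    // Same pipeline as above: flatten the lists, then flatten each line into words.
    @SafeVarargs
    static List<String> getWordList(List<String>... lists) {
        return Stream.of(lists)
                .flatMap(Collection::stream)
                .flatMap(str -> Arrays.stream(str.split("\\P{L}")))
                .collect(Collectors.toList());
    }

    // Hypothetical variant (not in the question): additionally drops empty tokens.
    @SafeVarargs
    static List<String> getNonEmptyWordList(List<String>... lists) {
        return Stream.of(lists)
                .flatMap(Collection::stream)
                .flatMap(str -> Arrays.stream(str.split("\\P{L}")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> firstBook = Arrays.asList("Hello, world");
        List<String> secondBook = Arrays.asList("second book");

        // ", " is two consecutive non-letters, so one empty token appears.
        System.out.println(getWordList(firstBook, secondBook));         // [Hello, , world, second, book]
        System.out.println(getNonEmptyWordList(firstBook, secondBook)); // [Hello, world, second, book]
    }
}
```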
Andronicus answered Dec 21 '22 22:12



You don't need to introduce so many variables:

private List<String> getWordList(List<String>... lists) {
    return Stream.of(lists) // Stream<List<String>>
                 .flatMap(Collection::stream) // Stream<String>
                 .flatMap(Pattern.compile("\\P{L}")::splitAsStream) // Stream<String>
                 .collect(toList()); // List<String>, with a static import of Collectors.toList
}

As Holger pointed out, .flatMap(Pattern.compile("\\P{L}")::splitAsStream) should be preferred over .flatMap(s -> Arrays.stream(s.split("\\P{L}"))). The pattern is compiled once, when the method reference is created, rather than once per stream element, and no intermediate array is allocated for each line.
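
If you prefer to make the "compile once" part explicit, one option is to hoist the pattern into a constant. This is only a sketch under a few assumptions: the class name WordListDemo, the constant name NON_LETTERS and the sample lines are illustrative and not part of the question, and the method is static so that @SafeVarargs can be applied on any Java version from 8 onward.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class WordListDemo {

    // Compiled a single time when the class is loaded and reused for every line.
    private static final Pattern NON_LETTERS = Pattern.compile("\\P{L}");

    @SafeVarargs
    static List<String> getWordList(List<String>... lists) {
        return Stream.of(lists)                        // Stream<List<String>>
                .flatMap(Collection::stream)           // Stream<String> of lines
                .flatMap(NON_LETTERS::splitAsStream)   // Stream<String> of words, no per-line array
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> firstBook = Arrays.asList("Call me Ishmael", "Some years ago");
        List<String> secondBook = Arrays.asList("It was the best of times");

        System.out.println(getWordList(firstBook, secondBook));
        // [Call, me, Ishmael, Some, years, ago, It, was, the, best, of, times]
    }
}
```

Behaviour is the same as the inline Pattern.compile(...)::splitAsStream form; the constant mainly helps readability and reuse across several methods.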

davidxxx answered Dec 21 '22 23:12
