Stream: Filter on children, return the parent

Assume a class MyClass:

public class MyClass {

    private final Integer myId;
    private final String myCSVListOfThings;

    public MyClass(Integer myId, String myCSVListOfThings) {
        this.myId = myId;
        this.myCSVListOfThings = myCSVListOfThings;
    }

    // Getters, Setters, etc
}

And this Stream:

final Stream<MyClass> streamOfObjects = Stream.of(
        new MyClass(1, "thing1;thing2;thing3"),
        new MyClass(2, "thing2;thing3;thing4"),
        new MyClass(3, "thingX;thingY;thingZ"));

I want to return every instance of MyClass that contains an entry "thing2" in myCSVListOfThings.

If I wanted a List<String> containing myCSVListOfThings this could be done easily:

List<String> filteredThings = streamOfObjects
        .flatMap(o -> Arrays.stream(o.getMyCSVListOfThings().split(";")))
        .filter("thing2"::equals)
        .collect(Collectors.toList()); 

But what I really need is a List<MyClass>.

This is what I have right now:

List<MyClass> filteredClasses = streamOfObjects.filter(o -> {
     Stream<String> things = Arrays.stream(o.getMyCSVListOfThings().split(";"));
     return things.anyMatch(s -> s.equals("thing2"));
}).collect(Collectors.toList());

But somehow it does not feel right. Any cleaner solution than opening a new Stream inside of a Predicate?

Anthony Accioly asked Aug 16 '16 11:08


3 Answers

First, I recommend adding a method public boolean containsThing(String str) to MyClass, so that your code becomes:

List<MyClass> filteredClasses = streamOfObjects
  .filter(o -> o.containsThing("thing2"))
  .collect(Collectors.toList());

You can then implement this method however best suits your input data: split into a Stream, split into a Set, search for a substring (if that is possible and makes sense), and cache the result if you need to.

You know much more about how this class is used, so you can make the right choice.
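One possible containsThing, sketched under the assumption that entries are separated by ';' and must match exactly. The names ContainsThingSketch, idsContaining and getMyId are illustrative and do not come from the question:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ContainsThingSketch {

    static final class MyClass {
        private final Integer myId;
        private final String myCSVListOfThings;

        MyClass(Integer myId, String myCSVListOfThings) {
            this.myId = myId;
            this.myCSVListOfThings = myCSVListOfThings;
        }

        Integer getMyId() {
            return myId;
        }

        // Assumed implementation: split on ';' and look for an exact entry.
        boolean containsThing(String thing) {
            return Arrays.asList(myCSVListOfThings.split(";")).contains(thing);
        }
    }

    // Illustrative helper: ids of the sample objects that contain the entry.
    static List<Integer> idsContaining(String thing) {
        return Stream.of(
                new MyClass(1, "thing1;thing2;thing3"),
                new MyClass(2, "thing2;thing3;thing4"),
                new MyClass(3, "thingX;thingY;thingZ"))
            .filter(o -> o.containsThing(thing))
            .map(MyClass::getMyId)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(idsContaining("thing2")); // prints [1, 2]
    }
}
```

Because the predicate only calls containsThing, the splitting strategy can later be swapped (for example for a cached Set) without touching the stream pipeline.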

Sergii Lagutin answered Oct 20 '22 04:10


One solution is to use a pattern matching that avoids the split-and-stream operation:

Pattern p=Pattern.compile("(^|;)thing2($|;)");
List<MyClass> filteredClasses = streamOfObjects
    .filter(o -> p.matcher(o.getMyCSVListOfThings()).find())
    .collect(Collectors.toList());

Since the argument to String.split is defined as a regex pattern, the pattern above has the same semantics as looking for a match within the result of split: you are looking for the word thing2 between two boundaries. The first is either the beginning of the line or a semicolon; the second is either the end of the line or a semicolon.
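To make the boundary behaviour concrete, here is a small sketch that builds the same pattern for an arbitrary entry. The helper entryPattern and the class name are hypothetical, and wrapping the entry in Pattern.quote is an addition that keeps regex metacharacters in the entry from being interpreted:

```java
import java.util.regex.Pattern;

class EntryPatternSketch {

    // Same shape as "(^|;)thing2($|;)", with the entry quoted literally.
    static Pattern entryPattern(String entry) {
        return Pattern.compile("(^|;)" + Pattern.quote(entry) + "($|;)");
    }

    public static void main(String[] args) {
        Pattern p = entryPattern("thing2");
        System.out.println(p.matcher("thing2;thing3;thing4").find());  // true: entry at the start
        System.out.println(p.matcher("thing1;thing2;thing3").find());  // true: entry in the middle
        System.out.println(p.matcher("thing1;thing2").find());         // true: entry at the end
        System.out.println(p.matcher("thing1;thing22;thing3").find()); // false: only a prefix of an entry
        System.out.println(p.matcher("thingX;thingY;thingZ").find());  // false: entry absent
    }
}
```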

Besides that, there is nothing wrong with using another Stream operation within a predicate, but there are some ways to improve it. First, the lambda expression gets more concise if you omit the unnecessary local variable holding the Stream. Generally, you should avoid holding Stream instances in local variables, as chaining the operations directly reduces the risk of trying to use a Stream more than once. Second, you can use the Pattern class to stream over the resulting elements of a split operation without collecting them all into an array first:

Pattern p=Pattern.compile(";");
List<MyClass> filteredClasses = streamOfObjects
    .filter(o -> p.splitAsStream(o.getMyCSVListOfThings()).anyMatch("thing2"::equals))
    .collect(Collectors.toList());

or

Pattern p=Pattern.compile(";");
List<MyClass> filteredClasses = streamOfObjects
    .filter(o -> p.splitAsStream(o.getMyCSVListOfThings()).anyMatch(s->s.equals("thing2")))
    .collect(Collectors.toList());
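On the advice about not keeping a Stream in a local variable: the sketch below shows what goes wrong when the same instance is used for two terminal operations. The Stream specification allows an implementation to throw IllegalStateException in that case, and the standard JDK implementation does. The class and method names are illustrative:

```java
import java.util.Arrays;
import java.util.stream.Stream;

class StreamReuseSketch {

    // Returns true if a second terminal operation on the same Stream fails.
    static boolean secondUseFails() {
        Stream<String> things = Arrays.stream("thing1;thing2;thing3".split(";"));
        things.anyMatch("thing2"::equals);     // first terminal operation: fine
        try {
            things.anyMatch("thing3"::equals); // second terminal operation on the same instance
            return false;
        } catch (IllegalStateException e) {
            return true;                       // the stream was already consumed
        }
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails()); // prints true on the standard JDK
    }
}
```

Chaining the operations directly, as in the snippets above, never leaves a consumed Stream reference around to be reused.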

Note that you could also rewrite your original code to:

List<MyClass> filteredClasses = streamOfObjects
    .filter(o -> Arrays.asList(o.getMyCSVListOfThings().split(";")).contains("thing2"))
    .collect(Collectors.toList());

Now the operation within the predicate is a Collection operation rather than a Stream operation, but this changes neither the semantics nor the correctness of the code.

Holger answered Oct 20 '22 04:10


As I see it you have three options.

1) look for the particular entry in the String without splitting it - still looks messy

List<MyClass> filteredClasses = streamOfObjects
              // wrap the value in ';' so that the first and last entries also match
              .filter(o -> (";" + o.getMyCSVListOfThings() + ";").contains(";thing2;"))
              .collect(Collectors.toList());

2) map twice - still messy

List<MyClass> filteredClasses = streamOfObjects
              .map(o -> Pair.of(o, toList(o.getMyCSVListOfThings())))
              .filter(pair -> pair.getRight().contains("thing2"))
              .map(pair -> pair.getLeft())
              .collect(Collectors.toList());

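where toList is a method that converts the String to a List<String>, and Pair is a tuple type with getLeft and getRight (for example org.apache.commons.lang3.tuple.Pair).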

3) create additional field - method I'd suggest

Extend class MyClass by adding a field:

List<String> values;

And initialize it in the constructor:

public MyClass(Integer myId, String myCSVListOfThings) {
    this.myId = myId;
    this.myCSVListOfThings = myCSVListOfThings;
    this.values = toList(myCSVListOfThings);
}

And then in the stream simply:

List<MyClass> filteredClasses = streamOfObjects
          .filter(o -> o.getValues().contains("thing2"))
          .collect(Collectors.toList());

Of course, the values field can instead be initialized lazily, on the first call to getValues, if you prefer.
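A minimal sketch of the lazy variant, assuming toList simply splits on ';' (the answer leaves its implementation open) and that the class is used from a single thread. The wrapper class name is illustrative:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LazyValuesSketch {

    static final class MyClass {
        private final Integer myId;
        private final String myCSVListOfThings;
        private List<String> values; // stays null until getValues() is first called

        MyClass(Integer myId, String myCSVListOfThings) {
            this.myId = myId;
            this.myCSVListOfThings = myCSVListOfThings;
        }

        // Assumed helper: splits the ';'-separated string into a read-only list.
        private static List<String> toList(String csv) {
            return Collections.unmodifiableList(Arrays.asList(csv.split(";")));
        }

        // Lazy initialization; not thread-safe as written.
        List<String> getValues() {
            if (values == null) {
                values = toList(myCSVListOfThings);
            }
            return values;
        }
    }

    public static void main(String[] args) {
        MyClass o = new MyClass(2, "thing2;thing3;thing4");
        System.out.println(o.getValues());                    // prints [thing2, thing3, thing4]
        System.out.println(o.getValues().contains("thing2")); // prints true
    }
}
```

The split happens at most once per object, and only for objects whose values are actually requested. If instances are shared between threads, the field would need synchronization or eager initialization in the constructor, as in the version above.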

jchmiel answered Oct 20 '22 02:10