I have a list of lists:
List<ArrayList<String>> D = new ArrayList<>();
When it's populated, it might look like:
["A", "B", "Y"]
["C", "D", "Y"]
["A", "D", "N"]
I want to split the list of lists into partitions based on the unique attribute values (let's say index 1).
The attribute at index 1 has two unique values, "B" and "D", so I want to split the list into two partitions:
["A", "B", "Y"]

["C", "D", "Y"]
["A", "D", "N"]
and put those into a List<ArrayList<ArrayList<String>>> sublists;
Is there a smart way of doing this, or do I just do something like this:
List<ArrayList<ArrayList<String>>> sublists = new ArrayList<>();
int featIdx = 1;
// generate the subsets
for (ArrayList<String> record : D) {
    String val = record.get(featIdx);
    // check if the value exists in sublists
    boolean found = false;
    for (ArrayList<ArrayList<String>> entry : sublists) {
        if (entry.get(0).get(featIdx).equals(val)) {
            entry.add(record);
            found = true;
            break;
        }
    }
    if (!found) {
        sublists.add(new ArrayList<>());
        sublists.get(sublists.size()-1).add(record);
    }
}
This is a step from the C4.5 decision tree algorithm, so if anyone has experience with it, I would appreciate it if you could let me know whether this is the right approach to generating the sublists.
Thank you.
With Java 8 you can use the groupingBy collector:
Map<String, List<List<String>>> grouped = D.stream()
    .collect(Collectors.groupingBy(list -> list.get(1)));
Collection<List<List<String>>> sublists = grouped.values();
or as suggested by @AlexisC:
import static java.util.stream.Collectors.collectingAndThen;
import static java.util.stream.Collectors.groupingBy;
Collection<List<List<String>>> sublists = D.stream()
    .collect(collectingAndThen(groupingBy(list -> list.get(1)), Map::values));
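One thing to keep in mind: the single-argument groupingBy makes no guarantee about the type or iteration order of the map it returns, so the partitions may not come out in the order the values were first seen (which is what the loop in the question produces). If that order matters, a minimal sketch along these lines should work. The class name PartitionDemo and the helper partition are made up for illustration, and the sample data is the one from the question:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionDemo {

    // Hypothetical helper: groups records by the value at featIdx.
    // LinkedHashMap keeps the groups in first-seen order of the key.
    public static <R extends List<String>> List<List<R>> partition(List<R> records, int featIdx) {
        Map<String, List<R>> grouped = records.stream()
            .collect(Collectors.groupingBy(
                record -> record.get(featIdx),
                LinkedHashMap::new,
                Collectors.toList()));
        return new ArrayList<>(grouped.values());
    }

    public static void main(String[] args) {
        List<ArrayList<String>> D = new ArrayList<>();
        D.add(new ArrayList<>(Arrays.asList("A", "B", "Y")));
        D.add(new ArrayList<>(Arrays.asList("C", "D", "Y")));
        D.add(new ArrayList<>(Arrays.asList("A", "D", "N")));

        List<List<ArrayList<String>>> sublists = partition(D, 1);
        // prints [[[A, B, Y]], [[C, D, Y], [A, D, N]]]
        System.out.println(sublists);
    }
}
```

Each record is visited once, so this avoids rescanning the existing sublists for every record as the nested loop does. Records with a missing attribute (a list shorter than featIdx + 1) are not handled here and would throw.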