I am working with demographic data. I have a collection of records about the different counties of a state (several records per county) that I want to aggregate by county.
I have implemented the following Consumer:
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class CountyPopulation implements Consumer<Population>
{
    private String countyId ;
    private List<Demographic> demographics ;

    public CountyPopulation()
    {
        demographics = new ArrayList<Demographic>() ;
    }

    public List<Demographic> getDemographics()
    {
        return demographics ;
    }

    // Accumulator: remembers the county id of the first record, then stores its demographic.
    @Override
    public void accept(Population pop)
    {
        if ( countyId == null )
        {
            countyId = pop.getCtyId() ;
        }
        demographics.add( pop.getDemographic() ) ;
    }

    // Combiner: merges another partial result into this one.
    public void combine(CountyPopulation other)
    {
        demographics.addAll( other.getDemographics() ) ;
    }
}
This CountyPopulation is used to aggregate data about a specific county using the following code (where "089" is a county identifier):
CountyPopulation ctyPop = populations
    .stream()
    .filter( e -> "089".equals( e.getCtyId() ) )
    .collect(CountyPopulation::new,
             CountyPopulation::accept,
             CountyPopulation::combine) ;
Now, I would like to remove the "filter" and group the records by county before using my aggregator.
Based on your first answers, I understand that this can be done using the static function Collector.of in the following way:
Map<String,CountyPopulation> pop = populations
    .stream()
    .collect(
        Collectors.groupingBy(Population::getCtyId,
            Collector.of( CountyPopulation::new,
                          CountyPopulation::accept,
                          CountyPopulation::combine ))) ;
However, this code does not work because Collector.of() has a different signature than collect(). I suspect that the solution involves modifying class CountyPopulation so that it implements java.util.function.BiConsumer instead of java.util.function.Consumer, but my attempt to do so hasn't worked and I am not clear why.
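Calling collect with three arguments on a Stream is equivalent to using Collector.of, except that the combiner passed to Collector.of is a BinaryOperator and so must return the merged container, whereas collect accepts a BiConsumer.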
So you can achieve your goal using:
Map<String,CountyPopulation> pop = populations.stream().collect(
    Collectors.groupingBy(Population::getCtyId, Collector.of(
        CountyPopulation::new, CountyPopulation::accept,
        (a,b)->{a.combine(b); return a;}))) ;
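To try this out end to end, here is a minimal self-contained sketch. The Population and Demographic classes below are hypothetical stand-ins (the question does not show them), and the wrapper class GroupByCountySketch and the method groupByCounty are names invented for illustration. The combiner is written as a lambda that returns the merged container, which is what Collector.of expects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class GroupByCountySketch {

    // Hypothetical stand-in for the question's Demographic type.
    public static class Demographic {
        final String label;
        public Demographic(String label) { this.label = label; }
    }

    // Hypothetical stand-in for the question's Population type.
    public static class Population {
        private final String ctyId;
        private final Demographic demographic;
        public Population(String ctyId, Demographic demographic) {
            this.ctyId = ctyId;
            this.demographic = demographic;
        }
        public String getCtyId() { return ctyId; }
        public Demographic getDemographic() { return demographic; }
    }

    // Same shape as the aggregator in the question.
    public static class CountyPopulation {
        private String countyId;
        private final List<Demographic> demographics = new ArrayList<>();

        public String getCountyId() { return countyId; }
        public List<Demographic> getDemographics() { return demographics; }

        public void accept(Population pop) {
            if (countyId == null) {
                countyId = pop.getCtyId();
            }
            demographics.add(pop.getDemographic());
        }

        public void combine(CountyPopulation other) {
            demographics.addAll(other.getDemographics());
        }
    }

    // Groups records by county id and aggregates each group into a CountyPopulation.
    public static Map<String, CountyPopulation> groupByCounty(List<Population> populations) {
        return populations.stream().collect(
            Collectors.groupingBy(Population::getCtyId,
                Collector.of(
                    CountyPopulation::new,                      // supplier
                    CountyPopulation::accept,                   // accumulator (BiConsumer)
                    (a, b) -> { a.combine(b); return a; })));   // combiner (BinaryOperator)
    }

    public static void main(String[] args) {
        List<Population> populations = Arrays.asList(
            new Population("089", new Demographic("first")),
            new Population("001", new Demographic("second")),
            new Population("089", new Demographic("third")));

        Map<String, CountyPopulation> byCounty = groupByCounty(populations);
        System.out.println(byCounty.get("089").getDemographics().size()); // prints 2
        System.out.println(byCounty.get("001").getDemographics().size()); // prints 1
    }
}
```

Note that CountyPopulation no longer needs to implement Consumer here; the method references adapt accept and the constructor to the functional interfaces that Collector.of asks for.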
For better parallel performance, it’s worth studying the optional Characteristics you can provide. If either or both of UNORDERED or CONCURRENT match the behavior of your CountyPopulation class, you may provide them (IDENTITY_FINISH is implied in your case).
And using groupingByConcurrent instead of groupingBy may also improve parallel performance.
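As a rough sketch of those two suggestions together, the variant below passes UNORDERED to Collector.of and switches to groupingByConcurrent on a parallel stream. It assumes the order of demographics within a county does not matter, which is my assumption and not something the question states. CONCURRENT is deliberately left out because the ArrayList inside CountyPopulation is not thread-safe. The Population and Demographic classes, the wrapper class, and the method name are again hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class ConcurrentGroupingSketch {

    // Hypothetical stand-in for the question's Demographic type.
    public static class Demographic {
        final int value;
        public Demographic(int value) { this.value = value; }
    }

    // Hypothetical stand-in for the question's Population type.
    public static class Population {
        private final String ctyId;
        private final Demographic demographic;
        public Population(String ctyId, Demographic demographic) {
            this.ctyId = ctyId;
            this.demographic = demographic;
        }
        public String getCtyId() { return ctyId; }
        public Demographic getDemographic() { return demographic; }
    }

    // Same shape as the aggregator in the question.
    public static class CountyPopulation {
        private String countyId;
        private final List<Demographic> demographics = new ArrayList<>();

        public List<Demographic> getDemographics() { return demographics; }

        public void accept(Population pop) {
            if (countyId == null) {
                countyId = pop.getCtyId();
            }
            demographics.add(pop.getDemographic());
        }

        public void combine(CountyPopulation other) {
            demographics.addAll(other.getDemographics());
        }
    }

    public static ConcurrentMap<String, CountyPopulation> groupByCountyConcurrent(
            List<Population> populations) {
        return populations.parallelStream().collect(
            Collectors.groupingByConcurrent(Population::getCtyId,
                Collector.of(
                    CountyPopulation::new,
                    CountyPopulation::accept,
                    (a, b) -> { a.combine(b); return a; },
                    // Assumption: element order within a county is irrelevant.
                    // CONCURRENT is omitted because ArrayList is not thread-safe.
                    Collector.Characteristics.UNORDERED)));
    }

    public static void main(String[] args) {
        List<Population> populations = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            populations.add(new Population("c" + (i % 4), new Demographic(i)));
        }
        ConcurrentMap<String, CountyPopulation> byCounty = groupByCountyConcurrent(populations);
        System.out.println(byCounty.size());                                // prints 4
        System.out.println(byCounty.get("c0").getDemographics().size());    // prints 250
    }
}
```

Whether this is actually faster than the sequential groupingBy version depends on the data size and the number of distinct counties, so it is worth measuring before adopting it.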