
Creating a top-5 aggregation with Ramda

I would like to transform this input

[
        { country: 'France', value: 100 },
        { country: 'France', value: 100 },
        { country: 'Romania', value: 500 },
        { country: 'England', value: 400 },
        { country: 'England', value: 400 },
        { country: 'Spain', value: 130 },
        { country: 'Albania', value: 4 },
        { country: 'Hungary', value: 3 }
]

into the output

[
      { country: 'England', value: 800 },
      { country: 'Romania', value: 500 },
      { country: 'France', value: 200 },
      { country: 'Spain', value: 130 },
      { country: 'Other', value: 7 }
]

This is basically the sum of values per country, keeping the top 4 countries and merging all remaining countries into a single "Other" entry.

I am using JavaScript with Ramda, and so far I have only managed to do it in a somewhat cumbersome way.

I am looking for an elegant solution. Can any functional programmer out there provide one, or suggest Ramda functions that would help?

Pierre-Jean asked May 08 '19 09:05


3 Answers

(Each step gets the output of the previous step. Everything will be put together in the end.)

Step 1: Get a map of sums

You can transform this:

[
  { country: 'France', value: 100 },
  { country: 'France', value: 100 },
  { country: 'Romania', value: 500 },
  { country: 'England', value: 400 },
  { country: 'England', value: 400 },
  { country: 'Spain', value: 130 },
  { country: 'Albania', value: 4 },
  { country: 'Hungary', value: 3 }
]

into this:

{
  Albania: 4,
  England: 800,
  France: 200,
  Hungary: 3,
  Romania: 500,
  Spain: 130
}

With this:

const reducer = reduceBy((sum, {value}) => sum + value, 0);
const reduceCountries = reducer(prop('country'));
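
For readers new to reduceBy, here is a dependency-free sketch of roughly what this step computes: one running sum per country key. The helper name sumByCountry and the sample rows are hypothetical, chosen only for illustration; they are not part of Ramda or of the answer.

```javascript
// Plain-JavaScript sketch of the group-and-sum step.
// `sumByCountry` is a hypothetical helper, not a Ramda function.
const sumByCountry = (rows) =>
  rows.reduce((sums, { country, value }) => {
    // Start each country at 0, then add the current value.
    sums[country] = (sums[country] || 0) + value;
    return sums;
  }, {});

const sampleRows = [
  { country: 'France', value: 100 },
  { country: 'France', value: 100 },
  { country: 'Hungary', value: 3 }
];

console.log(sumByCountry(sampleRows)); // { France: 200, Hungary: 3 }
```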

Step 2: Convert that back into a sorted array

[
  { country: "Hungary", value: 3 },
  { country: "Albania", value: 4 },
  { country: "Spain", value: 130 },
  { country: "France", value: 200 },
  { country: "Romania", value: 500 },
  { country: "England", value: 800 }
]

You can do this with:

const countryFromPair = ([country, value]) => ({country, value});
pipe(toPairs, map(countryFromPair), sortBy(prop('value')));
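
As a rough plain-JavaScript counterpart of that pipe, the sketch below turns a map of sums back into an array sorted in ascending order. The helper name toSortedRows is an assumption made for this example.

```javascript
// Hypothetical helper: turn a { country: total } map into a sorted array.
const toSortedRows = (sums) =>
  Object.entries(sums)                               // roughly what toPairs does
    .map(([country, value]) => ({ country, value })) // same shape as countryFromPair
    .sort((a, b) => a.value - b.value);              // ascending by value

console.log(toSortedRows({ England: 800, Hungary: 3, Spain: 130 }));
// [ { country: 'Hungary', value: 3 },
//   { country: 'Spain', value: 130 },
//   { country: 'England', value: 800 } ]
```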

Step 3: Create two sub groups, the non-top-4 countries and the top-4 countries

[
  [
    { country: "Hungary", value: 3},
    { country: "Albania", value: 4}
  ],
  [
    { country: "Spain", value: 130 },
    { country: "France", value: 200 },
    { country: "Romania", value: 500 },
    { country: "England", value: 800 }
  ]
]

Which you can do with this:

splitAt(-4)
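
The negative index counts from the end of the list, so the last four items land in the second group. A minimal sketch of the same split using Array.prototype.slice, assuming a positive n (splitOffLast is a hypothetical name):

```javascript
// Hypothetical helper mirroring how splitAt(-4) is used here:
// everything except the last `n` items, then the last `n` items.
// Assumes n is a positive integer.
const splitOffLast = (n, list) => [list.slice(0, -n), list.slice(-n)];

const [others, topFour] = splitOffLast(4, [3, 4, 130, 200, 500, 800]);
console.log(others);  // [ 3, 4 ]
console.log(topFour); // [ 130, 200, 500, 800 ]

// With four or fewer items, the first group is simply empty.
console.log(splitOffLast(4, [1, 2, 3])); // [ [], [ 1, 2, 3 ] ]
```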

Step 4: Merge the first sub group

[
  [
    { country: "Others", value: 7 }
  ],
  [
    { country: "Spain", value: 130 },
    { country: "France", value: 200 },
    { country: "Romania", value: 500 },
    { country: "England", value: 800 }
  ]
]

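With this (reduceOthers reuses the reducer from Step 1, but groups every remaining country under the single key 'Others'):

const reduceOthers = reducer(always('Others'));
over(lensIndex(0), compose(map(countryFromPair), toPairs, reduceOthers));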

Step 5: Flatten the entire array

[
  { country: "Others", value: 7 },
  { country: "Spain", value: 130 },
  { country: "France", value: 200 },
  { country: "Romania", value: 500 },
  { country: "England", value: 800 }
]

With this:

flatten

Complete working example

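// Assumes Ramda is installed and loaded as a CommonJS module.
const {reduceBy, always, prop, pipe, toPairs, map, sortBy, splitAt, over, lensIndex, compose, flatten} = require('ramda');

const data = [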
  { country: 'France', value: 100 },
  { country: 'France', value: 100 },
  { country: 'Romania', value: 500 },
  { country: 'England', value: 400 },
  { country: 'England', value: 400 },
  { country: 'Spain', value: 130 },
  { country: 'Albania', value: 4 },
  { country: 'Hungary', value: 3 }
];

const reducer = reduceBy((sum, {value}) => sum + value, 0);
const reduceOthers = reducer(always('Others'));
const reduceCountries = reducer(prop('country'));
const countryFromPair = ([country, value]) => ({country, value});

const top5 = pipe(
  reduceCountries,
  toPairs,
  map(countryFromPair),
  sortBy(prop('value')),
  splitAt(-4),
  over(lensIndex(0), compose(map(countryFromPair), toPairs, reduceOthers)),
  flatten
);

top5(data)
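
Note that this pipeline returns the rows in ascending order with "Others" first, while the question lists the largest total first and the merged entry last. The sketch below is a dependency-free variant that produces the question's order; topNWithOthers is a hypothetical name, and this is not the answer's own code. In the Ramda pipeline, the analogous change would presumably be to sort with sort(descend(prop('value'))), split with splitAt(4), and target lensIndex(1) instead of lensIndex(0).

```javascript
// Hypothetical variant: largest totals first, merged "Others" row last.
const topNWithOthers = (n, rows) => {
  // 1. Sum the values per country.
  const sums = new Map();
  for (const { country, value } of rows) {
    sums.set(country, (sums.get(country) || 0) + value);
  }
  // 2. Convert to rows and sort in descending order of total.
  const sorted = [...sums]
    .map(([country, value]) => ({ country, value }))
    .sort((a, b) => b.value - a.value);
  // 3. Keep the first n rows and fold the rest into one row.
  const head = sorted.slice(0, n);
  const tail = sorted.slice(n);
  if (tail.length === 0) return head; // nothing to merge
  const othersTotal = tail.reduce((acc, { value }) => acc + value, 0);
  return [...head, { country: 'Others', value: othersTotal }];
};

const rows = [
  { country: 'France', value: 100 },
  { country: 'France', value: 100 },
  { country: 'Romania', value: 500 },
  { country: 'England', value: 400 },
  { country: 'England', value: 400 },
  { country: 'Spain', value: 130 },
  { country: 'Albania', value: 4 },
  { country: 'Hungary', value: 3 }
];

console.log(topNWithOthers(4, rows));
// England 800, Romania 500, France 200, Spain 130, Others 7
```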
customcommander answered Oct 19 '22 16:10


Here's an approach:

const combineAllBut = (n) => pipe(drop(n), pluck(1), sum, of, prepend('Others'), of)

const transform = pipe(
  groupBy(prop('country')),
  map(pluck('value')),
  map(sum),
  toPairs,
  sort(descend(nth(1))),
  lift(concat)(take(4), combineAllBut(4)),
  map(zipObj(['country', 'value']))
)

const countries = [{ country: 'France', value: 100 }, { country: 'France', value: 100 }, { country: 'Romania', value: 500 }, { country: 'England', value: 400 }, { country: 'England', value: 400 }, { country: 'Spain', value: 130 }, { country: 'Albania', value: 4 }, { country: 'Hungary', value: 3 }]

console.log(transform(countries))
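
The snippet runs in a page that loads Ramda through a script tag (from bundle.run), which exposes the library as the global ramda. The functions used above are destructured from it before the code runs:

const {pipe, groupBy, prop, map, pluck, sum, of, prepend, toPairs, sort, descend, nth, lift, concat, take, drop, zipObj} = ramda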

Except for the one complex line (lift(concat)(take(4), combineAllBut(4))) and the associated helper function (combineAllBut), this is a set of simple transformations. That helper function is probably not useful outside this function, so it would be perfectly acceptable to inline it as lift(concat)(take(4), pipe(drop(4), pluck(1), sum, of, prepend('Others'), of)), but I find the resulting function a little too difficult to read.
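
To unpack that one complex line: when lift(concat) is applied to two functions, the resulting function passes the same input to both and concatenates their results. The following plain-JavaScript sketch imitates that behaviour for this specific case. concatResults, firstFour and othersPair are hypothetical stand-ins, not Ramda functions.

```javascript
// Hypothetical stand-in for lift(concat)(f, g): feed the same input
// to both functions and concatenate what they return.
const concatResults = (f, g) => (input) => f(input).concat(g(input));

// Stand-in for take(4).
const firstFour = (pairs) => pairs.slice(0, 4);

// Stand-in for combineAllBut(4): sum the totals after the first four
// pairs and wrap the result as [['Others', total]].
const othersPair = (pairs) => [
  ['Others', pairs.slice(4).reduce((acc, [, value]) => acc + value, 0)]
];

const sortedPairs = [
  ['England', 800], ['Romania', 500], ['France', 200],
  ['Spain', 130], ['Albania', 4], ['Hungary', 3]
];

console.log(concatResults(firstFour, othersPair)(sortedPairs));
// [ [ 'England', 800 ], [ 'Romania', 500 ], [ 'France', 200 ],
//   [ 'Spain', 130 ], [ 'Others', 7 ] ]
```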

Note that the helper returns something like [['Others', 7]], a format that is meaningless outside the fact that we then concat it with an array of the top four. So there is at least some argument for removing the final of and replacing concat with flip(append). I didn't do so, since that helper function means nothing except in the context of this pipeline. But I would understand if someone chose otherwise.

I like the rest of this function, and it seems to be a good fit for the Ramda pipeline style. But that helper function spoils it to some degree. I would love to hear suggestions for simplifying it.

Update

The answer from customcommander demonstrated a simplification I could make: using reduceBy instead of the groupBy -> map(pluck) -> map(sum) dance in the above approach. That makes for a definite improvement.

const combineAllBut = (n) => pipe(drop(n), pluck(1), sum, of, prepend('Others'), of)

const transform = pipe(
  reduceBy((a, {value}) => a + value, 0, prop('country')),
  toPairs,
  sort(descend(nth(1))),
  lift(concat)(take(4), combineAllBut(4)),
  map(zipObj(['country', 'value']))
)

const countries = [{ country: 'France', value: 100 }, { country: 'France', value: 100 }, { country: 'Romania', value: 500 }, { country: 'England', value: 400 }, { country: 'England', value: 400 }, { country: 'Spain', value: 130 }, { country: 'Albania', value: 4 }, { country: 'Hungary', value: 3 }]

console.log(transform(countries))
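
As before, the snippet runs in a page that loads Ramda through a script tag (from bundle.run), which exposes the library as the global ramda. The functions used above are destructured from it before the code runs:

const {pipe, reduceBy, prop, map, pluck, sum, of, prepend, toPairs, sort, descend, nth, lift, concat, take, drop, zipObj} = ramda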
Scott Sauyet answered Oct 19 '22 16:10


I gave it a try, using Ramda's functions for most things and keeping everything in a single pipe.

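const f = pipe(
  groupBy(prop('country')),  // { France: [rows...], ... }
  map(map(prop('value'))),   // { France: [100, 100], ... }
  map(sum),                  // { France: 200, ... }
  toPairs,                   // [['France', 200], ...]
  sortBy(prop(1)),           // ascending by total
  reverse,                   // descending by total
  // Relabel every pair past the top 4 as 'Others'.
  addIndex(map)((val, idx) => idx < 4 ? val : ['Others', val[1]]),
  groupBy(prop(0)),          // regroup so all 'Others' pairs share one key
  map(map(prop(1))),
  map(sum),
  toPairs,
  map(([country, value]) => ({ country, value }))
)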


However, I don't think it's readable at all.
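
The core idea of this pipe, relabelling everything past the fourth position and then summing by label a second time, may be easier to follow in a small dependency-free sketch. relabelAfter and sumPairs are hypothetical helper names introduced for this example.

```javascript
// Hypothetical helper: keep the first n pairs, relabel the rest as 'Others'.
const relabelAfter = (n, pairs) =>
  pairs.map(([name, total], idx) => (idx < n ? [name, total] : ['Others', total]));

// Hypothetical helper: sum the totals that share a label.
// Keys come back in first-seen order for these non-numeric names.
const sumPairs = (pairs) => {
  const totals = {};
  for (const [name, total] of pairs) {
    totals[name] = (totals[name] || 0) + total;
  }
  return Object.entries(totals).map(([country, value]) => ({ country, value }));
};

const sortedDesc = [
  ['England', 800], ['Romania', 500], ['France', 200],
  ['Spain', 130], ['Albania', 4], ['Hungary', 3]
];

console.log(sumPairs(relabelAfter(4, sortedDesc)));
// England 800, Romania 500, France 200, Spain 130, Others 7
```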

apple apple answered Oct 19 '22 17:10