
ramda.js: obtain a set of duplicates from an array of objects using specific properties

Consider this array of JavaScript objects (JSON). Each object has a b property and a u property (each also contains additional properties that I am not concerned with for this exercise).

[
    { "b": "A", "u": "F", ... },
    { "b": "M", "u": "T", ... },
    { "b": "A", "u": "F", ... },
    { "b": "M", "u": "T", ... },
    { "b": "M", "u": "T", ... },
    { "b": "X", "u": "Y", ... },
    { "b": "X", "u": "G", ... },
]

I would like to use ramda to find a set of all the duplicates. The result should look something like this.

[ 
    { "b": "A", "u":"F" },
    { "b": "M", "u":"T" } 
]

These two entries have duplicates: they are repeated 2 and 3 times in the original list, respectively.

edit

I have found a solution using underscore, that keeps the original array elements, and splits them perfectly into singles and duplicates. I prefer ramda.js, and underscore doesn't just give a set of duplicates - as per the question, so I am leaving the question open until someone can answer using ramda. I am moving on with underscore until the question is answered.

I have a repl that finds the unique values... as a start...

Jim asked Oct 30 '22 02:10



2 Answers

This seems overcomplicated and unlikely to be performant, but one option would be this:

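// `includes` is the current name; older Ramda releases call it `contains`.
const foo = pipe(
  project(['b', 'u']), // keep only the b and u properties
  reduce(
    // results: items seen at least twice; foundOnce: items seen once so far
    ({results, foundOnce}, item) => includes(item, results)
      ? {results, foundOnce}
      : includes(item, foundOnce)
        ? {results: append(item, results), foundOnce}
        : {results, foundOnce: append(item, foundOnce)},
    {results: [], foundOnce: []}
  ),
  prop('results')
)

foo(xs); //=> [{b: 'A', u: 'F'}, {b: 'M', u: 'T'}]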

Perhaps this version is easier to understand, but it takes an extra iteration through the data:

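// `includes` is the current name; older Ramda releases call it `contains`.
const foo = pipe(
  project(['b', 'u']),
  reduce(
    // every repeat is appended to results, so duplicates can appear more than once
    ({results, foundOnce}, item) => includes(item, foundOnce)
        ? {results: append(item, results), foundOnce}
        : {results, foundOnce: append(item, foundOnce)},
    {results: [], foundOnce: []}
  ),
  prop('results'),
  uniq // the extra iteration: collapse repeated entries in results
)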

repl here
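
For readers who want to trace the logic outside the REPL, here is a minimal dependency-free sketch of the same two-list idea. It assumes the sample data from the question reduced to its b and u properties (the other properties are left out), and the names sameKey, hasItem, and findDups are made up for illustration; they are not part of Ramda.

```javascript
// Sample data from the question, reduced to the two properties of interest.
const xs = [
  { b: "A", u: "F" },
  { b: "M", u: "T" },
  { b: "A", u: "F" },
  { b: "M", u: "T" },
  { b: "M", u: "T" },
  { b: "X", u: "Y" },
  { b: "X", u: "G" },
];

// Hypothetical helpers: compare on b and u only, and test list membership.
const sameKey = (x, y) => x.b === y.b && x.u === y.u;
const hasItem = (item, list) => list.some((other) => sameKey(item, other));

// Same shape as the reducer above: `foundOnce` holds items seen once so far,
// `results` holds items seen at least twice (each recorded a single time).
const findDups = (items) =>
  items
    .map(({ b, u }) => ({ b, u }))
    .reduce(
      ({ results, foundOnce }, item) =>
        hasItem(item, results)
          ? { results, foundOnce }
          : hasItem(item, foundOnce)
            ? { results: [...results, item], foundOnce }
            : { results, foundOnce: [...foundOnce, item] },
      { results: [], foundOnce: [] }
    ).results;

console.log(findDups(xs)); //=> [{b: 'A', u: 'F'}, {b: 'M', u: 'T'}]
```

Each membership check scans a list, so the work grows roughly quadratically with the input in the worst case, which is in line with the "unlikely to be performant" caveat.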

Scott Sauyet answered Nov 15 '22 05:11



If you don't care about looping over your data multiple times, you could do something like this:

  • Create partial copies that contain only the relevant props, using pick (your own idea)
  • Use groupBy with a hash function to group similar objects. (Alternatively: sort first and use groupWith(equals))
  • Get the grouped arrays using values
  • Filter out arrays with only 1 item (those are not duped...) using filter
  • Map over the results and return the first element of each array using map(head)

In code:

const containsMoreThanOne = compose(lt(1), length);
const hash = JSON.stringify; // Naive.. watch out for key-order!

const getDups = pipe(
  map(pick(["b", "u"])),
  groupBy(hash),
  values,
  filter(containsMoreThanOne),
  map(head)
);

getDups(data);

Working demo in Ramda REPL.
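
The key-order caveat on the hash can be illustrated with a small sketch. JSON.stringify serializes properties in insertion order, so two objects with the same values but differently ordered keys produce different strings. One option, offered here as an assumption and not as part of the answer above, is to build the hash from an explicit key list; hashBy and hashBU are hypothetical helper names.

```javascript
// Two objects with equal contents but different key insertion order.
const p = { b: "A", u: "F" };
const q = { u: "F", b: "A" };

// The naive hash depends on insertion order, so these strings differ.
console.log(JSON.stringify(p) === JSON.stringify(q)); //=> false

// Hypothetical helper: serialize the values of the given keys in a fixed order.
const hashBy = (keys) => (obj) => JSON.stringify(keys.map((k) => obj[k]));
const hashBU = hashBy(["b", "u"]);

console.log(hashBU(p) === hashBU(q)); //=> true
console.log(hashBU(p)); //=> ["A","F"]
```

A function like hashBU could then be passed to groupBy in place of hash, so the grouping would not rely on how the objects were constructed.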

A more hybrid approach would be to cram all this logic into one reducer, but it looks kind of messy to me...

const clean = pick(["b", "u"]);
const hash = JSON.stringify;
const dupReducer = hash => (acc, o) => {
  const h = hash(o);
  // Mutate internal state
  acc.done[h] = (acc.done[h] || 0) + 1;
  if (acc.done[h] === 2) acc.result.push(o);

  return acc;
};


const getDups = (clean, hash, data) =>
  reduce(dupReducer(hash), { result: [], done: { } }, map(clean, data)).result;

getDups(clean, hash, data);

REPL

user3297291 answered Nov 15 '22 06:11
