Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to filter multiple properties of a parent child array which could be several tree level deep

TL;DR; To make it simple, how can I filter multiple properties of a parent child array which could be several tree level deep. This is for an Open Source datagrid lib used by several hundred users.

So I have an array of parent/children references, the children can also have children themselves and so on, there's no limit on the tree level depth. Also, I need to be able to filter from not just the property that has the tree structure but on any propertie(s), that are columns in a grid, of that array.

For example, I have this array, which is representing a file explorer list

const myFiles = [
    {id: 11, file: "Music", parentId: null },
    {id: 12, file: "mp3", parentId: 11 },
    {id: 14, file: "pop", parentId: 12 },
    {id: 15, file: "theme.mp3", dateModified: "2015-03-01", size: 85, parentId: 14, },
    {id: 16, file: "rock", parentId: 12 },
    {id: 17, file: "soft.mp3", dateModified: "2015-05-13", size: 98, parentId: 16, },
    {id: 18, file: "else.txt", dateModified: "2015-03-03", size: 90, parentId: null, },
    {id: 21, file: "Documents", parentId: null, },
    {id: 2, file: "txt", parentId: 21 },
    {id: 3, file: "todo.txt", dateModified: "2015-05-12", size: 0.7, parentId: 2, },
    {id: 4, file: "pdf", parentId: 21 },
    {id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
    {id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4, },
    {id: 6, file: "internet-bill.pdf", dateModified: "2015-05-12", size: 1.4, parentId: 4, },
    {id: 7, file: "xls", parentId: 21 },
    {id: 8, file: "compilation.xls", dateModified: "2014-10-02", size: 2.3, parentId: 7, },
    {id: 9, file: "misc", parentId: 21 },
    {id: 10, file: "something.txt", dateModified: "2015-02-26", size: 0.4, parentId: 9, },
]

The array looks flat but in reality, it's a tree view structure that is represented in a datagrid as shown below.

What I found that partially works is to loop through the entire array and add the full list of files that each item can have including itself, for example, if Documents has a child PDF which has itself has a child Map.pdf, then the tree mapping can be represented by ["Documents", "PDF", "map.pdf"] and we store that on the parent object, then on the next child we store ["PDF", "map.pdf"] and finally on the last child we store ["map.pdf"] like so

    {id: 21, file: "Documents", parentId: null, treeMap: ["Documents", "PDF", "map.pdf"] }
    {id: 4, file: "pdf", parentId: 21, treeMap: ["PDF", "map.pdf"] }
    {id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4, treeMap: ["map.pdf"] }

and this is the method allowing me to do that

export function modifyDatasetToAddTreeMapping(items: any[], treeViewColumn: Column, dataView: any) {
  for (let i = 0; i < items.length; i++) {
    items[i]['treeMap'] = [items[i][treeViewColumn.id]];
    let item = items[i];

    if (item['parentId'] !== null) {
      let parent = dataView.getItemById(item['parentId']);

      while (parent) {
        parent['treeMap'] = dedupePrimitiveArray(parent['treeMap'].concat(item['treeMap']));
        item = parent;
        parent = dataView.getItemById(item['parentId']);
      }
    }
  }
}

export function dedupePrimitiveArray(inputArray: Array<number | string>): Array<number | string> {
  const seen = {};
  const out = [];
  const len = inputArray.length;
  let j = 0;
  for (let i = 0; i < len; i++) {
    const item = inputArray[i];
    if (seen[item] !== 1) {
      seen[item] = 1;
      out[j++] = item;
    }
  }
  return out;
}

Then the datagrid lib uses the Filter method which I can use this way, where columnFilters is an object containing 1 or more filter, for example const columnFilters = { file: 'map', size: '>3' }

The datagrid is a lib (SlickGrid) and it uses the filter method like so dataView.setFilter(treeFilter);

function treeFilter(dataView: any, item: any) {
    const columnFilters = { file: this.searchString.toLowerCase(), size: 2 };
    let filterCount = 0;

    if (item[parentPropName] !== null) {
      let parent = dataView.getItemById(item['parentId']);
      while (parent) {
        if (parent.__collapsed) {
          return false;
        }
        parent = dataView.getItemById(parent['parentId']);
      }
    }

    for (const columnId in columnFilters) {
      if (columnId !== undefined && columnFilters[columnId] !== '') {
        filterCount++;

        if (item.treeMap === undefined || !item.treeMap.find((itm: string) => itm.endsWith(columnFilters[columnId]))) {
          return false;
        }
      }
    }
    return true;
  }

With the call of modifyDatasetToAddTreeMapping() it works ok if I want to filter on the File column, but if I add more column filters, it doesn't work as intended. For example if you take a look at the 2nd print screen, you see that I entered "map" and that will display the "Documents > PDF > map.pdf" and that is great but if add a file size lower than 3Mb it shouldn't display "map.pdf" and because that file is not shown and "Documents > PDF" don't contain the word "map" then nothing should display, so as you can see the filter is not behaving as it should.

So with the current implementation, I have 2 problems 1. not behaving correctly when a tree node is not displayed, the parent of it shouldn't be displayed 2. having to call modifyDatasetToAddTreeMapping() is an extra call that might not be needed 3. it also modifies the source array, I could deep clone the array but that would be yet another expense in performance

This might be doable by recursion, after converting to a hierarchical structure (tree), but I can't figure out the best algorithm to do this if it's with recursion, isn't it expensive to always drill down the tree to find items?

Lastly, the intention is to use this with SlickGrid which might have 10k or even 50k rows so it has to be fast. You can see this SlickGrid demo but their implementation of the filtering is not correct, also I found the method add the mapping in this other SO Answer

NOTE: I would also like to point out that a resolution to this issue will benefit possibly a few hundred (or thousand) users since it is to be implement in Angular-Slickgrid and Aurelia-Slickgrid which are both Open Source lib and used by at least 300+ users.

enter image description here

Filtering with the word "map" should not return anything here since none of the nodes/children have that text. enter image description here

EDIT

The best code would be to plug whatever code does the job into a regular JS filter, so that means the end solution would be a method myFilter that would be a filter callback method. The reason I'm stuck with this is because I use an external lib SlickGrid and I have to use what that lib has available as public methods exposed.

function myFilter(item, args) {
  const columnFilters = args.columnFilters;

  // iterate through each items of the dataset
  // return true/false on each item
}

// to be used as a drop in
dataView.setFilterArgs({ columnFilters: this._columnFilters });
dataView.setFilter(myFilter.bind(this));

If I have const columnFilters = { file: "map", size: "<3.2" }; , the expected result of the array would be 4 lines

// result
[
  {id: 21, file: "Documents", parentId: null },
  {id: 4, file: "pdf", parentId: 21, },
  {id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
  {id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4, }
]

If I have const columnFilters = { file: "map", size: "<3" }; , the expected result of the array would be 3 lines

// result
[
  {id: 21, file: "Documents", parentId: null },
  {id: 4, file: "pdf", parentId: 21, },
  {id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
]

and finally if I have const columnFilters = { file: "map", size: ">3" }; then the expected result would be an empty array because none of the file have that char and file size conditions.

EDIT 2

From @AlexL's answer, it's starting to work. Just a couple of tweaking, but it looks very promising already enter image description here

EDIT 3

Thanks to Alex awesome work, his answer helped me merge this into my Open Source lib. I now have 2 live demos with Parent/Child ref (flat dataset) and with a Hierarchical Dataset (tree dataset). I wish I could upvote more than once :)

like image 232
ghiscoding Avatar asked Apr 04 '20 20:04

ghiscoding


People also ask

How do you add multiple conditions to a filter?

If you want to put multiple conditions in filter , you can use && and || operator.


1 Answers

I have a way to do it. It should be fairly performant but we might also want to swap out map and reduce etc. for good old for-loops to optimize speed further (I have seen various blogs and articles comparing speed of forEach, map etc. to for-loop and for-loops seem to win)

Here's a demo (also here: https://codepen.io/Alexander9111/pen/abvojzN):

const myFiles = [
  { id: 11, file: "Music", parentId: null },
  { id: 12, file: "mp3", parentId: 11 },
  { id: 14, file: "pop", parentId: 12 },
  { id: 15, file: "theme.mp3", dateModified: "2015-03-01", size: 85,  parentId: 14 },
  { id: 16, file: "rock", parentId: 12 },
  { id: 17, file: "soft.mp3", dateModified: "2015-05-13", size: 98, parentId: 16 },
  { id: 18, file: "else.txt", dateModified: "2015-03-03", size: 90, parentId: null },
  { id: 21, file: "Documents", parentId: null },
  { id: 2, file: "txt", parentId: 21 },
  { id: 3, file: "todo.txt", dateModified: "2015-05-12", size: 0.7, parentId: 2 },
  { id: 4, file: "pdf", parentId: 21 },
  { id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
  { id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4 },
  { id: 6, file: "internet-bill.pdf", dateModified: "2015-05-12", size: 1.4, parentId: 4 },
  { id: 7, file: "xls", parentId: 21 },
  { id: 8, file: "compilation.xls", dateModified: "2014-10-02", size: 2.3, parentId: 7 },
  { id: 9, file: "misc", parentId: 21 },
  { id: 10,  file: "something.txt", dateModified: "2015-02-26", size: 0.4,  parentId: 9 }
];

//example how to use the "<3" string - better way than using eval():
const columnFilters = { file: "map", size: "<3.2" }; //, size: "<3" 
const isSizeValid = Function("return " + myFiles[11].size + "<3")();
//console.log(isSizeValid);

const myObj = myFiles.reduce((aggObj, child) => {
  aggObj[child.id] = child;
  //the filtered data is used again as each subsequent letter is typed
  //we need to delete the ._used property, otherwise the logic below
  //in the while loop (which checks for parents) doesn't work:
  delete aggObj[child.id]._used;
  return aggObj;
}, {});

function filterMyFiles(myArray, columnFilters){
  const filteredChildren = myArray.filter(a => {
    for (let key in columnFilters){
      //console.log(key)      
      if (a.hasOwnProperty(key)){
        const strContains =  String(a[key]).includes(columnFilters[key]);
        const re = /(?:(?:^|[-+<>=_*/])(?:\s*-?\d+(\.\d+)?(?:[eE][+-<>=]?\d+)?\s*))+$/;
        const comparison = re.test(columnFilters[key]) && Function("return " + a[key] + columnFilters[key])();
        if (strContains || comparison){
          //don't return true as need to check other keys in columnFilters
        }else{
          //console.log('false', a)
          return false;
        }
      } else{
        return false;
      }           
    }
    //console.log('true', a)
    return true;
  })
  return filteredChildren;
}

const initFiltered = filterMyFiles(myFiles, columnFilters);

const finalWithParents = initFiltered.map(child => {
  const childWithParents = [child];
  let parent = myObj[child.parentId];
  while (parent){
    //console.log('parent', parent)
    parent._used || childWithParents.unshift(parent)
    myObj[parent.id]._used = true;
    parent = myObj[parent.parentId] || false;    
  }
  return childWithParents;
}).flat();

console.log(finalWithParents)
.as-console-wrapper { max-height: 100% !important; top: 0; }

Basically set up an object to later use for finding all parents.

Then do one filter (i.e. one iteration of the array) and filter for those that match the conditions in the columnFilters object.

Then map (i.e. one iteration) over this filtered array and find each parents using the object created at the start (so nested iterations up to N depth).

flatten the array with .flat() (assumed one final iteration) and then we are done.

Any questions let me know.

Update - For-loop approach plus trying to reduce iterations over array

Cut out a couple of iterations :) (https://codepen.io/Alexander9111/pen/MWagdVz):

const myFiles = [
  { id: 11, file: "Music", parentId: null },
  { id: 12, file: "mp3", parentId: 11 },
  { id: 14, file: "pop", parentId: 12 },
  { id: 15, file: "theme.mp3", dateModified: "2015-03-01", size: 85,  parentId: 14 },
  { id: 16, file: "rock", parentId: 12 },
  { id: 17, file: "soft.mp3", dateModified: "2015-05-13", size: 98, parentId: 16 },
  { id: 18, file: "else.txt", dateModified: "2015-03-03", size: 90, parentId: null },
  { id: 21, file: "Documents", parentId: null },
  { id: 2, file: "txt", parentId: 21 },
  { id: 3, file: "todo.txt", dateModified: "2015-05-12", size: 0.7, parentId: 2 },
  { id: 4, file: "pdf", parentId: 21 },
  { id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
  { id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4 },
  { id: 6, file: "internet-bill.pdf", dateModified: "2015-05-12", size: 1.4, parentId: 4 },
  { id: 7, file: "xls", parentId: 21 },
  { id: 8, file: "compilation.xls", dateModified: "2014-10-02", size: 2.3, parentId: 7 },
  { id: 9, file: "misc", parentId: 21 },
  { id: 10,  file: "something.txt", dateModified: "2015-02-26", size: 0.4,  parentId: 9 }
];

const columnFilters = { file: "map", size: "<3.2" };
console.log(customLocalFilter(myFiles, columnFilters));

function customLocalFilter(array, filters){  
  const myObj = {};
  for (let i = 0; i < myFiles.length; i++) {
    myObj[myFiles[i].id] = myFiles[i];
    //the filtered data is used again as each subsequent letter is typed
    //we need to delete the ._used property, otherwise the logic below
    //in the while loop (which checks for parents) doesn't work:
    delete myObj[myFiles[i].id]._used;
  }

  const filteredChildrenAndParents = [];
  for (let i = 0; i < myFiles.length; i++) {
    const a = myFiles[i];
    let matchFilter = true;
    for (let key in columnFilters) {
      if (a.hasOwnProperty(key)) {
        const strContains = String(a[key]).includes(columnFilters[key]);
        const re = /(?:(?:^|[-+<>!=_*/])(?:\s*-?\d+(\.\d+)?(?:[eE][+-<>!=]?\d+)?\s*))+$/;
        const comparison =
          re.test(columnFilters[key]) &&
          Function("return " + a[key] + columnFilters[key])();
        if (strContains || comparison) {
          //don't return true as need to check other keys in columnFilters
        } else {
          matchFilter = false;
          continue;
        }
      } else {
        matchFilter = false;
        continue;
      }
    }
    if (matchFilter) {
      const len = filteredChildrenAndParents.length;
      filteredChildrenAndParents.splice(len, 0, a);
      let parent = myObj[a.parentId] || false;
      while (parent) {
        //only add parent if not already added:
        parent._used || filteredChildrenAndParents.splice(len, 0, parent);
        //mark each parent as used so not used again:
        myObj[parent.id]._used = true;
        //try to find parent of the current parent, if exists:
        parent = myObj[parent.parentId] || false;
      }
    }
  }
  return filteredChildrenAndParents;
}
.as-console-wrapper { max-height: 100% !important; top: 0; }
like image 199
Alex L Avatar answered Nov 02 '22 23:11

Alex L