
Does Crossfilter require a flat data structure?

All the examples of Crossfilter I've found use a flat structure like this:

[
  { name: "Rusty",  type: "human", legs: 2 },
  { name: "Alex",   type: "human", legs: 2 },
  ...
  { name: "Fiona",  type: "plant", legs: 0 }
]

or

"date","open","high","low","close","volume","oi"
11/01/1985,115.48,116.78,115.48,116.28,900900,0
11/04/1985,116.28,117.07,115.82,116.04,753400,0
11/05/1985,116.04,116.57,115.88,116.44,876800,0

I have hundreds of MBs of flat files I process to yield a 1-2MB JSON object with a structure roughly like:

{
  "meta": {"stuff": "here"},
  "data": {
    "accountName": {
      // rolled up by week
      "2013-05-20": {
        // any of several "dimensions"
        "byDay": {
          "2013-05-26": {
            "values": {
              "thing1": 1,
              "thing2": 2,
              "etc": 3
            }
          },
          "2013-05-27": {
            "values": {
              "thing1": 4,
              "thing2": 5,
              "etc": 6
            }
          }
          // and so on for day
        },
        "bySource": {
          "sourceA": {
            "values": {
              "thing1": 2,
              "thing2": 6,
              "etc": 7
            }
          },
          "sourceB": {
            "values": {
              "thing1": 3,
              "thing2": 1,
              "etc": 2
            }
          }
        }
      }
    }
  }
}

Which I'd like to display as a table like:

Group: byDay* || bySource || byWhatever

           | thing1 | thing2 | etc
2013-05-26 |      1 |      2 |   3
2013-05-27 |      4 |      5 |   6

or:

Group: byDay || bySource* || byWhatever

           | thing1 | thing2 | etc
sourceA    |      2 |      6 |   7
sourceB    |      3 |      1 |   2

Flattening this JSON structure would be difficult and yield a very large object.

I'd love to take advantage of Crossfilter's wonderful features, but I'm unsure if it's possible.

Is it possible for me to define/explain my current structure to Crossfilter? Perhaps there's another way I could approach this? I'll readily admit that I don't have a good grasp on dimensions and many other key Crossfilter concepts.

user2487135 asked Jun 14 '13 18:06


1 Answer

Crossfilter works on an array of records, with each element of the array being mapped to one or more values via dimensions (which are defined using accessor functions).
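
To make the record/accessor idea concrete, here is a minimal plain-JavaScript sketch using the flat records from the question. It does not load Crossfilter; `sumBy` is a hypothetical helper written only for illustration, and the commented lines at the end show roughly how the same accessor would be handed to the library.

```javascript
// Flat records, as in the question's first example.
const records = [
  { name: "Rusty", type: "human", legs: 2 },
  { name: "Alex",  type: "human", legs: 2 },
  { name: "Fiona", type: "plant", legs: 0 }
];

// A dimension is defined by an accessor: it maps one record to one value.
const byType = (d) => d.type;

// Hypothetical helper (not part of Crossfilter): sums valueOf(record)
// for every distinct value returned by the accessor keyOf.
function sumBy(rows, keyOf, valueOf) {
  const totals = new Map();
  for (const row of rows) {
    const key = keyOf(row);
    totals.set(key, (totals.get(key) || 0) + valueOf(row));
  }
  return Array.from(totals, ([key, value]) => ({ key, value }));
}

const legsByType = sumBy(records, byType, (d) => d.legs);

// With the library loaded, the rough equivalent would be:
//   const cf = crossfilter(records);
//   const typeDim = cf.dimension(byType);
//   typeDim.group().reduceSum((d) => d.legs).all();
```

The point is only that the input is an array and that each dimension is a function of a single record; Crossfilter adds the indexing and incremental filtering on top of that.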

Even if your data contains aggregate results, you can still use it with Crossfilter. Note, however, that data aggregated across different dimensions cannot be combined: the "by day" totals in your example no longer say how each day splits by source, so they cannot be cross-filtered against the "by source" totals. You could create a separate Crossfilter for each aggregated dimension, e.g. one for "by day", and run queries and groups on it, but I'm not sure how useful that would be compared with what you already have.

As for memory usage, are you sure flattening your nested structure would really be that problematic? Bear in mind that each record (element of the flattened array) can contain references to strings and other objects in your nested structure, so you wouldn't necessarily use up all that much memory.
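
As a rough sketch of what that flattening could look like, assuming the structure shown in the question: `flatten` and the record field names (`account`, `week`, `grouping`, `key`) are hypothetical, chosen only for this example. Each record stores a reference to the existing `values` object instead of copying it.

```javascript
// Trimmed copy of the nested structure from the question.
const nested = {
  meta: { stuff: "here" },
  data: {
    accountName: {
      "2013-05-20": {
        byDay: {
          "2013-05-26": { values: { thing1: 1, thing2: 2, etc: 3 } },
          "2013-05-27": { values: { thing1: 4, thing2: 5, etc: 6 } }
        },
        bySource: {
          sourceA: { values: { thing1: 2, thing2: 6, etc: 7 } },
          sourceB: { values: { thing1: 3, thing2: 1, etc: 2 } }
        }
      }
    }
  }
};

// Hypothetical helper: emits one record per leaf bucket.
// `values` points at the original object, so no numbers are duplicated.
function flatten(root) {
  const records = [];
  for (const [account, weeks] of Object.entries(root.data)) {
    for (const [week, groupings] of Object.entries(weeks)) {
      for (const [grouping, buckets] of Object.entries(groupings)) {
        for (const [key, bucket] of Object.entries(buckets)) {
          records.push({ account, week, grouping, key, values: bucket.values });
        }
      }
    }
  }
  return records;
}

const flat = flatten(nested);

// Keep each pre-aggregated grouping in its own array, since "byDay" and
// "bySource" rows must not be mixed in one Crossfilter.
const byDayRecords = flat.filter((r) => r.grouping === "byDay");
const bySourceRecords = flat.filter((r) => r.grouping === "bySource");
```

Each of the two arrays could then be passed to its own `crossfilter(...)` call, with dimensions such as `(d) => d.key` or `(d) => d.week`.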

Jason Davies answered Oct 22 '22 22:10