
How to encode this data to a parent / children structure in JSON

I am working with d3.js to visualise families of animals (organisms), up to 4000 at a time, as a tree graph, though the data source could just as well be a directory listing or a list of namespaced objects. My data looks like:

json = {
    organisms:[
        {name: 'Hemiptera.Miridae.Kanakamiris'},
        {name: 'Hemiptera.Miridae.Neophloeobia.incisa'},
        {name: 'Lepidoptera.Nymphalidae.Ephinephile.rawnsleyi'},
        ... etc ...
    ]
}

My question is: what is the best way to convert the above data to the hierarchical parent / children data structure used by a number of the d3 visualisations, such as treemap (for an example of the data, see flare.json in the d3/examples/data/ directory)? Here is an example of the desired data structure:

{"name": "ROOT",
 "children": [
        {"name": "Hemiptera",
         "children": [
             {"name": "Miridae",
              "children": [
                  {"name": "Kanakamiris", "children":[]},
                  {"name": "Neophloeobia",
                   "children": [
                       {"name": "incisa", "children":[] }
                   ]}
              ]}
         ]},
        {"name": "Lepidoptera",
         "children": [
             {"name": "Nymphalidae",
              "children": [
                  {"name": "Ephinephile",
                   "children": [
                       {"name": "rawnsleyi", "children":[] }
                   ]}
              ]}
         ]}
    ]
}

EDIT: enclosed all the original desired data structure inside a ROOT node, so as to conform with the structure of the d3 examples, which have only one master parent node.

I am looking to understand a general design pattern, and as a bonus I would love to see some solutions in JavaScript, PHP, or even Python. JavaScript is my preference. Regarding PHP: the data I am actually using comes from a database call made by a PHP script that encodes the results as JSON. The database result in the PHP script is an ordered array (see below), if that is any use for PHP-based answers.

Array
(
    [0] => Array
        (
            ['Rank_Order'] => 'Hemiptera'
            ['Rank_Family'] => 'Miridae'
            ['Rank_Genus'] => 'Kanakamiris'
            ['Rank_Species'] => ''
        ) ........

where: 'Rank_Order' isParentOf 'Rank_Family' isParentOf 'Rank_Genus' isParentOf 'Rank_Species'
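
As a rough illustration of that parent-of ordering, and assuming the rows reach the browser as JSON objects with the same four keys, each row can be turned into a path that runs from the order down to the lowest non-empty rank. The names `RANKS` and `rowToPath` are hypothetical and used only for this sketch.

```javascript
// Assumption: rows arrive as JSON objects keyed like the PHP array above.
// RANKS lists the rank keys from parent to child.
var RANKS = ['Rank_Order', 'Rank_Family', 'Rank_Genus', 'Rank_Species'];

// Hypothetical helper: returns the rank values of one row as an array,
// stopping at the first empty rank (e.g. a row with no species).
function rowToPath(row) {
  var path = [];
  for (var i = 0; i < RANKS.length; i++) {
    var value = row[RANKS[i]];
    if (!value) {
      break;
    }
    path.push(value);
  }
  return path;
}

var examplePath = rowToPath({
  Rank_Order: 'Hemiptera',
  Rank_Family: 'Miridae',
  Rank_Genus: 'Kanakamiris',
  Rank_Species: ''
});
console.log(examplePath.join('.')); // Hemiptera.Miridae.Kanakamiris
```

Joining the path with '.' gives the same dotted form as the `name` values in the data at the top of the question.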

I asked a similar question focussed on a PHP solution here, but the only answer is not working on my server and I don't quite understand what is going on. So I want to ask this question from a design pattern perspective, and to include reference to my actual use, which is in JavaScript and d3.js.

johowie asked Aug 26 '12 01:08



3 Answers

The following is specific to the structure you've provided, but it could be made more generic fairly easily. I'm sure the addChild function can be simplified. Hopefully the comments are helpful.

function toHeirarchy(obj) {

  // Get the organisms array
  // (children is declared here so it does not leak as an implicit global)
  var orgName, children, orgNames = obj.organisms;

  // Make root object
  var root = {name:'ROOT', children:[]};

  // For each organism, get the name parts
  for (var i=0, iLen=orgNames.length; i<iLen; i++) {
    orgName = orgNames[i].name.split('.');

    // Start from root.children
    children = root.children;

    // For each part of name, get child if already have it
    // or add new object and child if not
    for (var j=0, jLen=orgName.length; j<jLen; j++) {
      children = addChild(children, orgName[j]);      
    }
  }
  return root;

  // Helper function, iterates over children looking for 
  // name. If found, returns its child array, otherwise adds a new
  // child object and child array and returns it.
  function addChild(children, name) {

    // Look for name in children
    for (var i=0, iLen=children.length; i<iLen; i++) {

      // If find name, return its child array
      if (children[i].name === name) {
        return children[i].children;        
      }
    }
    // If didn't find name, add a new object and 
    // return its child array
    children.push({'name': name, 'children':[]});
    return children[children.length - 1].children;
  }
}
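
One possible way to make this more generic, offered only as a sketch: the function below assumes the input is a plain array of delimited path strings, and the names `pathsToTree`, `separator` and `rootName` are illustrative rather than part of the code above. It also keeps a name lookup beside each node, so finding an existing child does not require scanning the children array, which may help with several thousand names.

```javascript
// Sketch: build a {name, children} tree from delimited path strings.
// pathsToTree, separator and rootName are hypothetical names.
function pathsToTree(paths, separator, rootName) {
  var root = { name: rootName || 'ROOT', children: [] };

  // Parallel lookup structure: each entry pairs an output node with a
  // name -> entry table of its children. It is kept outside the output,
  // so the result contains only "name" and "children".
  var rootEntry = { node: root, byName: Object.create(null) };

  paths.forEach(function (path) {
    var entry = rootEntry;
    path.split(separator || '.').forEach(function (part) {
      if (part === '') {
        return; // ignore empty segments
      }
      var next = entry.byName[part];
      if (!next) {
        // First time this name appears under this parent: create the node
        next = {
          node: { name: part, children: [] },
          byName: Object.create(null)
        };
        entry.byName[part] = next;
        entry.node.children.push(next.node);
      }
      entry = next;
    });
  });
  return root;
}

var tree = pathsToTree([
  'Hemiptera.Miridae.Kanakamiris',
  'Hemiptera.Miridae.Neophloeobia.incisa',
  'Lepidoptera.Nymphalidae.Ephinephile.rawnsleyi'
], '.');
console.log(JSON.stringify(tree, null, 2));
```

With the question's data this would be called as pathsToTree(json.organisms.map(function (o) { return o.name; }), '.').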
RobG answered Sep 21 '22 02:09


Given your starting input I believe something like the following code will produce your desired output. I don't imagine this is the prettiest way to do it, but it's what came to mind at the time.

It seemed easiest to pre-process the data to first split up the initial array of strings into an array of arrays like this:

[
   ["Hemiptera","Miridae","Kanakamiris" ],
   ["Hemiptera","Miridae","Neophloeobia","incisa" ],
   //etc
]

...and then process that to get a working object in a form something like this:

  working = {
       Hemiptera : {
           Miridae : {
              Kanakamiris : {},
              Neophloeobia : {
                  incisa : {}
              }
           }
       },
       Lepidoptera : {
           Nymphalidae : {
              Ephinephile : {
                  rawnsleyi : {}
              }
           }
       }
    }

...because working with objects rather than arrays makes it easier to test whether child items already exist. Having created the above structure I then process it one last time to get your final desired output. So:

// start by remapping the data to an array of arrays
// (data is the object shown in the question, where it is called json)
var organisms = data.organisms.map(function(v) {
        return v.name.split(".");
    });

// this function recursively processes the above array of arrays
// to create an object whose properties are also objects
function addToHeirarchy(val, level, heirarchy) {
    if (val[level]) {
        if (!heirarchy.hasOwnProperty(val[level]))
            heirarchy[val[level]] = {};
        addToHeirarchy(val, level + 1, heirarchy[val[level]]);
    }
}
var working = {};    
for (var i = 0; i < organisms.length; i++)
    addToHeirarchy(organisms[i], 0, working);

// this function recursively processes the object created above
// to create the desired final structure
function remapHeirarchy(item) {
    var children = [];
    for (var k in item) {
        children.push({
            "name" : k,
            "children" : remapHeirarchy(item[k])
        });
    }
    return children;
}

var heirarchy = {
    "name" : "ROOT",
    "children" : remapHeirarchy(working)
};

Demo: http://jsfiddle.net/a669F/1/

nnnnnn answered Sep 24 '22 02:09


An alternative answer to my own question. In the past day I have learnt a great deal more about d3.js and, in relation to this question, d3.nest() with .key() and .entries() is my friend (all d3 functions). This answer involves changing the initial data, so it may not qualify as a good answer to the specific question I asked. However, if someone has a similar question and can change things on the server, then this is a pretty simple solution:

Return the data from the database in this format:

json = {'Organisms': [
    { 'Rank_Order': 'Hemiptera',
      'Rank_Family': 'Miridae',
      'Rank_Genus': 'Kanakamiris',
      'Rank_Species': '' },
    {}, ...
]}

Then use d3.nest():

var organismNest = d3.nest()
    .key(function(d){return d.Rank_Order;})
    .key(function(d){return d.Rank_Family;})
    .key(function(d){return d.Rank_Genus;})
    .key(function(d){return d.Rank_Species;})
    .entries(json.Organisms);

This returns an array of nested entries, one of which is shown here (abbreviated):

{
key: "Hemiptera"
  values: [
    {
      key: "Cicadidae"
      values: [
        {
          key: "Pauropsalta "
          values: [
            {
              key: "siccanus"
              values: [
                       Rank_Family: "Cicadidae"
                       Rank_Genus: "Pauropsalta "
                       Rank_Order: "Hemiptera"
                       Rank_Species: "siccanus"
                       AnotherOriginalDataKey: "original data value"

etc etc, nested and lovely

This returns something very similar to the array that I described as my desired format in the question, with a few differences. In particular, there is no all-enclosing ROOT element, and whereas the keys I originally wanted were "name" and "children", .nest() returns them as "key" and "values" respectively. These alternative keys are easy enough to use in d3.js by defining appropriate data accessor functions (a basic d3 concept), but that is getting beyond the original scope of the question. Hope that helps someone too.
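
If the "name" / "children" shape with a ROOT node is still wanted, the nested entries could also be converted afterwards. The following is only a sketch under a few assumptions: `nestToTree` and `isNestEntry` are hypothetical helper names, the input is an array of {key, values} entries like the output above, the original rows do not themselves have both a `key` and a `values` property, and blank keys (such as an empty Rank_Species) should not become nodes. It also trims whitespace from names, which is a choice made here and not something d3 does.

```javascript
// Hypothetical helper: true for a nest entry ({key, values}),
// false for an original data row at the bottom of the nesting.
function isNestEntry(item) {
  return item !== null && typeof item === 'object' &&
         typeof item.key === 'string' && Array.isArray(item.values);
}

// Sketch: convert nest-style entries to {name, children} under one root.
function nestToTree(entries, rootName) {
  function convert(list) {
    var children = [];
    list.forEach(function (item) {
      // Skip original rows and blank keys; neither becomes a node.
      if (!isNestEntry(item) || item.key.trim() === '') {
        return;
      }
      children.push({
        name: item.key.trim(),
        children: convert(item.values)
      });
    });
    return children;
  }
  return { name: rootName || 'ROOT', children: convert(entries) };
}

// Hand-written input in the same shape as the nested output above
var nested = [
  { key: 'Hemiptera', values: [
    { key: 'Miridae', values: [
      { key: 'Kanakamiris', values: [
        { key: '', values: [
          { Rank_Order: 'Hemiptera', Rank_Family: 'Miridae',
            Rank_Genus: 'Kanakamiris', Rank_Species: '' }
        ] }
      ] }
    ] }
  ] }
];
var nestedTree = nestToTree(nested);
console.log(JSON.stringify(nestedTree, null, 2));
```

Here the genus Kanakamiris ends up as a leaf with an empty children array, because its only species key is blank.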

johowie answered Sep 24 '22 02:09