
Picking Out Simple Properties from Hierarchical JSON

Tags: c#, json.net

* Despite the edit to my title by another user, I am seeking a solution that uses JSON.NET's library from C# *

A reply containing pseudocode is fine! :)

I'm trying to work with hierarchical data provided by a JSON dataset. I'm using C# and JSON.NET. I'm open to using Linq in general and Linq for JSON.NET in particular if it would help; otherwise, using non-Linq C#/JSON.NET is fine.

Ideally, I am trying to accomplish two things elegantly:

  1. I want to extract JSON that represents each branch and that branch's own properties--not its child (nested) branch objects (I will explain more in a moment).

  2. I want to track the parent node as I create my branch objects.

For further consideration, please refer to the following JSON excerpt:

{
  "Branch1": {
    "Prop1A" : "1A",
    "Prop1B" : "1B",
    "Prop1C" : "1C",
    "Branch2" : {
      "Prop2A" : "2A",
      "Prop2B" : "2B",
      "Prop2C" : "2C",
      "Branch3" : {
        "Prop3A" : "3A",
        "Prop3B" : "3B",
        "Prop3C" : "3C"
      }
    }
  }
}

Related to Goal 1 (from above): Given JSON that is composed of nested JSON objects, I want to pick out only the simple (string) properties for each branch. For instance, I would like to extract the JSON for Branch1 that would contain only Prop1A, Prop1B, and Prop1C properties. I would then like to extract the JSON for Branch2 that would contain only Prop2A, Prop2B, and Prop2C properties, etc. I realize that I can represent the entire JSON as a JSON.NET JToken object then iterate through its Children() and look only for JTokenType.Property types, but perhaps there is a more elegant way to quickly pick out just the property types using Linq...? In the end, I would have three separate JSON objects that would look like this:

JSON Object 1:

{
  "Prop1A" : "1A",
  "Prop1B" : "1B",
  "Prop1C" : "1C"
}

JSON Object 2:

{
  "Prop2A" : "2A",
  "Prop2B" : "2B",
  "Prop2C" : "2C"
}

JSON Object 3:

{
  "Prop3A" : "3A",
  "Prop3B" : "3B",
  "Prop3C" : "3C"
}

Related to Goal 2 (from above): Ideally, each extracted JSON above would also have a property indicating its parent. Thus, the final JSON objects would look something like this:

{
  "Prop1A" : "1A",
  "Prop1B" : "1B",
  "Prop1C" : "1C",
  "Parent" : ""
}

And:

{
  "Prop2A" : "2A",
  "Prop2B" : "2B",
  "Prop2C" : "2C",
  "Parent" : "Branch1"
}

And:

{
  "Prop3A" : "3A",
  "Prop3B" : "3B",
  "Prop3C" : "3C",
  "Parent" : "Branch2"
}

Any thoughts?

asked Jul 23 '16 15:07 by Jazimov


1 Answer

You can use JContainer.DescendantsAndSelf() to find all objects in the JSON hierarchy, then, for each object, loop through its properties and keep only those whose value is a JValue primitive. The following query creates a List<JObject> containing the property names and values you require:

using System.Linq;
using Newtonsoft.Json.Linq;

var root = (JContainer)JToken.Parse(jsonString);                        // jsonString holds the JSON from the question

var query1 = from o in root.DescendantsAndSelf().OfType<JObject>()      // Find objects
             let l = o.Properties().Where(p => p.Value is JValue)       // Select their primitive properties
             where l.Any()                                              // Skip objects with no primitive properties
             select new JObject(l);                                     // And return a JObject

var list1 = query1.ToList();

To always skip the root object even if it has primitive properties, use JContainer.Descendants(). And if you really only want string-valued properties (rather than primitive properties), you can check the JToken.Type property:

             let l = o.Properties().Where(p => p.Value.Type == JTokenType.String)       // Select their string-valued properties
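
To make the difference between the two filters concrete, here is a minimal sketch. The sample JSON, the class name StringOnlyDemo, and the method names are made up for illustration and are not part of the question's data:

```csharp
using System.Linq;
using Newtonsoft.Json.Linq;

public static class StringOnlyDemo
{
    // Hypothetical sample: Branch1 mixes a string, a number, and a boolean.
    public const string Json =
        @"{ ""Branch1"": { ""Name"": ""one"", ""Count"": 3, ""Enabled"": true,
                           ""Branch2"": { ""Name"": ""two"" } } }";

    // Names of all primitive-valued properties (strings, numbers, booleans, nulls, ...).
    public static string[] PrimitiveNames()
    {
        var root = (JContainer)JToken.Parse(Json);
        return root.Descendants().OfType<JObject>()             // Descendants() skips the root itself
                   .SelectMany(o => o.Properties().Where(p => p.Value is JValue))
                   .Select(p => p.Name)
                   .ToArray();
    }

    // Names of string-valued properties only.
    public static string[] StringNames()
    {
        var root = (JContainer)JToken.Parse(Json);
        return root.Descendants().OfType<JObject>()
                   .SelectMany(o => o.Properties().Where(p => p.Value.Type == JTokenType.String))
                   .Select(p => p.Name)
                   .ToArray();
    }
}
```

With this sample, PrimitiveNames() yields Name, Count, Enabled, Name, while StringNames() yields only the two Name properties, because Count and Enabled are JValue primitives but not strings.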

The query can be extended to add a synthetic "Parent" property. With JToken.Ancestors(), the first JProperty ancestor of an object is the property that contains it, so this version records that property's name:

var query2 = from o in root.DescendantsAndSelf().OfType<JObject>()      // Find objects
             let l = o.Properties().Where(p => p.Value is JValue)       // Select their primitive properties
             where l.Any()                                              // Skip objects with no primitive properties
             // Add synthetic "Parent" property
             let l2 = l.Concat(new[] { new JProperty("Parent", o.Ancestors().OfType<JProperty>().Select(a => a.Name).FirstOrDefault() ?? "") })
             select new JObject(l2);                                    // And return a JObject.

var list2 = query2.ToList();

However, your desired output seems to show the name of the branch that encloses each object (the parent's property name) rather than the name of the property that holds the object itself. If so, skip the first ancestor property:

var query3 = from o in root.DescendantsAndSelf().OfType<JObject>()      // Find objects
             let l = o.Properties().Where(p => p.Value is JValue)       // Select their primitive properties
             where l.Any()                                              // Skip objects with no primitive properties
             // Add synthetic "Parent" property
             let l2 = l.Concat(new[] { new JProperty("Parent", o.Ancestors().OfType<JProperty>().Skip(1).Select(a => a.Name).FirstOrDefault() ?? "") })
             select new JObject(l2);                                    // And return a JObject.

var list3 = query3.ToList();

For the final query, if I do:

Console.WriteLine(JsonConvert.SerializeObject(list3, Formatting.Indented));

The following output is generated, showing the JObject list has the contents you require:

[
  {
    "Prop1A": "1A",
    "Prop1B": "1B",
    "Prop1C": "1C",
    "Parent": ""
  },
  {
    "Prop2A": "2A",
    "Prop2B": "2B",
    "Prop2C": "2C",
    "Parent": "Branch1"
  },
  {
    "Prop3A": "3A",
    "Prop3B": "3B",
    "Prop3C": "3C",
    "Parent": "Branch2"
  }
]
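
Since the question also allows non-LINQ approaches, the same list can be sketched with a plain recursive walk that passes the parent name down explicitly. This is only an illustrative alternative under some assumptions: the names RecursiveWalkDemo, Walk, and Collect are invented here, and unlike DescendantsAndSelf() this walk only follows objects nested directly as property values, so objects inside arrays are not visited.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class RecursiveWalkDemo
{
    // name:       property name that holds obj ("" for the root object)
    // parentName: name of the branch that encloses obj ("" if there is none)
    public static void Walk(JObject obj, string name, string parentName, List<JObject> results)
    {
        var primitives = obj.Properties().Where(p => p.Value is JValue).ToList();
        if (primitives.Count > 0)                       // Skip objects with no primitive properties
        {
            var flat = new JObject(primitives);         // Properties are copied, not moved
            flat["Parent"] = parentName;                // The indexer adds or overwrites "Parent"
            results.Add(flat);
        }

        // Recurse into child objects; the current name becomes their parent name.
        foreach (var child in obj.Properties().Where(p => p.Value is JObject))
            Walk((JObject)child.Value, child.Name, name, results);
    }

    public static List<JObject> Collect(string json)
    {
        var results = new List<JObject>();
        Walk(JObject.Parse(json), "", "", results);
        return results;
    }
}
```

Calling RecursiveWalkDemo.Collect(jsonString) on the JSON from the question should produce the same three objects, in the same order, as list3.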

Note that if one of the JSON objects already has a primitive property named "Parent", the JObject constructor will throw an exception, because a JObject cannot contain two properties with the same name.
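
One possible way to guard against that collision is to pick a property name that is not yet in use before adding the synthetic property. The sketch below is an assumption-laden illustration: the class ParentNameDemo, the helpers FreeName and Flatten, and the "Parent1", "Parent2", ... fallback scheme are all hypothetical choices, not something the question asked for.

```csharp
using System.Linq;
using Newtonsoft.Json.Linq;

public static class ParentNameDemo
{
    // Returns baseName if o has no such property, otherwise baseName1, baseName2, ...
    // (hypothetical naming scheme).
    public static string FreeName(JObject o, string baseName)
    {
        var name = baseName;
        for (var i = 1; o.Property(name) != null; i++)
            name = baseName + i;
        return name;
    }

    // Copies the primitive properties of o and adds the enclosing branch name
    // under a property name that does not clash with the copied properties.
    public static JObject Flatten(JObject o)
    {
        var primitives = o.Properties().Where(p => p.Value is JValue);
        var parent = o.Ancestors().OfType<JProperty>()
                      .Skip(1)                          // Skip the property that holds o itself
                      .Select(a => a.Name)
                      .FirstOrDefault() ?? "";

        var result = new JObject(primitives);
        result.Add(FreeName(result, "Parent"), parent);
        return result;
    }
}
```

In the queries, select ParentNameDemo.Flatten(o) could then replace the let l2 / select new JObject(l2) pair. Whether renaming, overwriting, or failing loudly is the right behavior depends on how the output is consumed.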

answered Nov 08 '22 04:11 by dbc