
Parse large JSON file with the bulk of the data inside a property of the root object

Tags: c#, json.net

I'm parsing a relatively large (1.01 GB) JSON file with Json.NET. The file is structured as shown below:

{
  "Stores": [
    {
        "Number": 1234,
        "City": "Denver",
        "County": "Denver",
        "Departments": [
            {
                "Name": "Deli",
                "Description": "sliced lunchmeat"
            },
            {
                "Name": "Produce",
                "Description": "fruit and vegetables"
            }
        ],
        "Telephone": "555-1212"
    },
    {
        "Number": 5678,
        "City": "Parker",
        "County": "Douglas",
        "Departments": [
            {
                "Name": "Seafood",
                "Description": "creatures of the sea"
            },
            {
                "Name": "Meat",
                "Description": ""
            }
        ],
        "Telephone": "555-2323"
    }
  ]
}

I attempted to parse the whole JSON file at once, but ran into out-of-memory exceptions, so I am now trying to parse the file in chunks. I'm able to get it working using the code below, but only if I remove the root Stores object. The actual file that I have to parse has the root Stores object, so I need to figure out how to get this to work. I'm relatively new to JSON, so any help would be greatly appreciated. Thanks.

using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read))
using (StreamReader sr = new StreamReader(fs))
using (JsonTextReader reader = new JsonTextReader(sr))
{
    while (reader.Read())
    {
        DepartmentInfo = "";

        if (reader.TokenType == JsonToken.StartObject)
        {
            dynamic obj = JObject.Load(reader);

            City = obj.City.ToString();
            County = obj.County.ToString();

            foreach (var Department in obj.Departments)
            {
                DepartmentInfo += Department.ToString();
            }
        }
    }
}

UPDATE: I updated the JSON above to show the actual format I'm working with; it was missing the Description field in the Departments array. For a little more context: I'm loading this file into a database and am required to keep the data as close to the source as possible so it can be tracked. Further processing then happens in the database to normalize the data into another data model, so DepartmentInfo is kind of gnarly by design. The final code is below. Thanks!

string DepartmentInfo = "";

using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read))
using (StreamReader sr = new StreamReader(fs))
using (JsonTextReader reader = new JsonTextReader(sr))
{
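    // Skip to the start of the Stores array; checking Read() stops the loop
    // if the input ends before an array is found.
    while (reader.TokenType != JsonToken.StartArray && reader.Read())
    {
    }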

    while(reader.Read())
    {
        DepartmentInfo = "";

        if (reader.TokenType == JsonToken.StartObject)
        {
            dynamic obj = JObject.Load(reader);

            var StoreNumber = obj["Number"];
            var City = obj["City"].ToString();
            var County = obj["County"].ToString();
            var PhoneNumber = obj["Telephone"];

            foreach (var Department in obj.Departments)
            {
                DepartmentInfo += "Name: " + Department.Name.ToString() + ", Description: " + Department.Description.ToString() + " ";
            }
        }
    }
}
Asked Jan 23 '16 at 03:01 by Dave


1 Answer

You're on the right track. You just need a little bit of code to advance the reader to the beginning of the Stores array, then process from there using the code you already have. Here is the revised code:

using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read))
using (StreamReader sr = new StreamReader(fs))
using (JsonTextReader reader = new JsonTextReader(sr))
{
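    // Advance the reader to the start of the first array,
    // which should be the value of the "Stores" property.
    // Checking the result of Read() stops the loop if the input ends without an array.
    while (reader.TokenType != JsonToken.StartArray && reader.Read())
    {
    }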

    // Now process each store individually
    while (reader.Read())
    {
        if (reader.TokenType == JsonToken.StartObject)
        {
            dynamic obj = JObject.Load(reader);

            // ...
        }
    }
}

Here is a working fiddle demonstrating the concept. Note that I'm using a hardcoded JSON string and a MemoryStream instead of a FileStream, but the result is the same. I've also made some tweaks to make the department output more readable.
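
For readers who can't open the fiddle, here is a minimal self-contained sketch of the same idea. It is not the fiddle's code: it reads from a StringReader over a small hardcoded sample rather than a MemoryStream, and the names StoreDemo, SampleJson, and ReadStoreSummaries are made up for illustration. It assumes the Newtonsoft.Json package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class StoreDemo
{
    // Hypothetical sample data, shaped like the file in the question.
    public const string SampleJson = @"{
      ""Stores"": [
        { ""Number"": 1234, ""City"": ""Denver"", ""County"": ""Denver"",
          ""Departments"": [ { ""Name"": ""Deli"", ""Description"": ""sliced lunchmeat"" } ],
          ""Telephone"": ""555-1212"" },
        { ""Number"": 5678, ""City"": ""Parker"", ""County"": ""Douglas"",
          ""Departments"": [ { ""Name"": ""Meat"", ""Description"": """" } ],
          ""Telephone"": ""555-2323"" }
      ]
    }";

    // Lazily yields one summary line per store, so only the store
    // currently being processed is materialized as a JObject.
    public static IEnumerable<string> ReadStoreSummaries(TextReader text)
    {
        using (var reader = new JsonTextReader(text))
        {
            // Skip ahead to the first array (assumed to be the value of "Stores"),
            // stopping if the input ends before one is found.
            while (reader.TokenType != JsonToken.StartArray && reader.Read()) { }

            while (reader.Read())
            {
                // JObject.Load consumes nested arrays, so an EndArray seen
                // here can only be the end of the Stores array itself.
                if (reader.TokenType == JsonToken.EndArray)
                    yield break;

                if (reader.TokenType == JsonToken.StartObject)
                {
                    JObject store = JObject.Load(reader);
                    var names = new List<string>();
                    foreach (JToken dept in store["Departments"] ?? new JArray())
                        names.Add((string)dept["Name"]);

                    yield return (string)store["City"] + ": " + string.Join(", ", names);
                }
            }
        }
    }

    public static void Main()
    {
        foreach (string line in ReadStoreSummaries(new StringReader(SampleJson)))
            Console.WriteLine(line);
    }
}
```

To run this against the real file, pass a StreamReader opened on the file in place of the StringReader. Stopping at EndArray is a small addition that keeps any objects appearing after the Stores array from being treated as stores.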

Answered Sep 20 '22 at 00:09 by Brian Rogers