 

JSON Data Optimization by removing repeated column names

Tags:

json

I have a basic JSON question. I have a JSON file in which every object repeats the same column names:

[
  {
    "id": 1,
    "name": "ABCD"
  },
  {
    "id": 2,
    "name": "ABCDE"
  },
  {
    "id": 3,
    "name": "ABCDEF"
  }
]

To optimize it, I was thinking of removing the repeated column names:

{
    "cols": [
        "id",
        "name"
    ],
    "rows": [
        [
            "1",
            "ABCD"
        ],
        [
            "2",
            "ABCDE"
        ]
    ]
}

What I am trying to understand is: is this a better approach? Are there any disadvantages to this format, for example when writing unit tests?

user1401472 asked Sep 12 '25 21:09


2 Answers

EDIT

The second case (after your edit) is valid JSON. You can generate the following class from it using json2csharp:

public class RootObject
{
    public List<string> cols { get; set; }
    public List<List<string>> rows { get; set; }
}

An important point about JSON: when records are represented as objects, there is no way to avoid repeating the column names (or keys in general) in every object. You can check the validity of your JSON by pasting it into jsonlint.com.

If the goal is to shrink the JSON, much as you would with a compression library such as gzip, I would recommend Json.HPack.
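Before choosing a format, it can help to measure. The following is a minimal sketch, assuming Python's standard `json` and `gzip` modules and made-up sample data (1000 records shaped like the ones in the question); it prints the raw and gzipped sizes of both layouts rather than asserting which one wins after compression.

```python
import gzip
import json

# Hypothetical sample data: 1000 records shaped like the ones in the question.
records = [{"id": i, "name": "ABCD" + str(i)} for i in range(1000)]

# Column-oriented form: key names appear once, rows are plain arrays.
columnar = {
    "cols": ["id", "name"],
    "rows": [[r["id"], r["name"]] for r in records],
}


def sizes(obj):
    """Return (raw_bytes, gzipped_bytes) for the compact JSON encoding of obj."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


records_raw, records_gz = sizes(records)
columnar_raw, columnar_gz = sizes(columnar)

print("array of objects:", records_raw, "bytes raw,", records_gz, "bytes gzipped")
print("cols/rows form:  ", columnar_raw, "bytes raw,", columnar_gz, "bytes gzipped")
```

Repeated key names are exactly the kind of redundancy a general-purpose compressor removes, so the gap between the two layouts after gzip may be much smaller than the gap between the raw sizes. Run it on your own data before deciding.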

This format has several compression levels, ranging from 0 to 4 (4 compresses the most).

At compression level 0:

The keys (property names) are removed from the structure and placed in a header at index 0 that lists each property name. The compressed JSON then looks like this:

[
  [
    "id",
    "name"
  ],
  [
    1,
    "ABCD"
  ],
  [
    2,
    "ABCDE"
  ],
  [
    3,
    "ABCDEF"
  ]
]

You can compress your JSON at whichever level you like. However, to work with the data as objects in a JSON library, you must first unpack it back into the form with repeated property names, like the one you provided earlier.
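The header-row layout above, and the unpacking step it requires, can be sketched in a few lines. This is an illustration only, not the Json.HPack library's API: `pack` and `unpack` are hypothetical helper names, and the sketch assumes every record has the same keys as the first one.

```python
def pack(records):
    """Turn a list of same-shaped dicts into [header, row, row, ...]."""
    if not records:
        return []
    # Assumption: every record has exactly the keys of the first record.
    header = list(records[0].keys())
    return [header] + [[rec[key] for key in header] for rec in records]


def unpack(packed):
    """Rebuild the list of dicts from [header, row, row, ...]."""
    if not packed:
        return []
    header, *rows = packed
    return [dict(zip(header, row)) for row in rows]


records = [
    {"id": 1, "name": "ABCD"},
    {"id": 2, "name": "ABCDE"},
    {"id": 3, "name": "ABCDEF"},
]

packed = pack(records)
restored = unpack(packed)
```

Here `packed` is the array-of-arrays form with the header at index 0, and `restored` is the original list of objects again, so both ends of a connection need to agree on the convention.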

For reference, here is a comparison between different compression techniques:

[image: comparison of compression techniques]

Wasif Hossain answered Sep 15 '25 19:09


{
   "cols": [
       "id",
       "name"
   ],
   "rows": [
       [
           "1",
           "ABCD"
       ], [
           "2",
           "ABCDE"
       ], [
           "3",
           "ABCDEF"
       ]
   ]
}

With this approach it is hard to tell which value stands for which item (id, name). Your first approach is fine if you use this JSON for communication.
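To make the readability concern concrete, here is a small sketch of how the cols/rows form behaves in a test. The helper name `rows_as_dicts` is hypothetical; the point is that positional access silently depends on column order, while mapping rows back to names gives assertions that read like the original objects.

```python
import json

payload = json.loads(
    '{"cols": ["id", "name"],'
    ' "rows": [["1", "ABCD"], ["2", "ABCDE"], ["3", "ABCDEF"]]}'
)


def rows_as_dicts(payload):
    """Pair each row with the column names; reject rows of the wrong length."""
    cols = payload["cols"]
    result = []
    for row in payload["rows"]:
        if len(row) != len(cols):
            raise ValueError("row length does not match number of columns")
        result.append(dict(zip(cols, row)))
    return result


# Positional access: correct only while "name" stays the second column.
name_by_position = payload["rows"][0][1]

# Named access: keeps working if the producer reorders the columns.
name_by_key = rows_as_dicts(payload)[0]["name"]
```

A unit test written against `name_by_position` would break, or worse, pass with the wrong field, if the producer ever reordered `cols`; the named form avoids that at the cost of one extra conversion step.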

Rashad answered Sep 15 '25 20:09