
How to trim all spaces except those within quotes for large JSON file

I am working with a large JSON file and want to shrink it by deleting all extra spaces, tabs, line breaks, etc. that are not within quotes. The file is some 100,000 lines long, which makes it slow for my other scripts to process. It originally looks like this:

{
  "path": "/math/", 
  "id": "math", 
  "title": "Math Title",       
  "icon_url": "/images/power-mode/badges/circles-40x40.png",   
  "contains": [
    "Topic", 
    "Video", 
   "Exercise"
  ], 
  "children": [], 
  "parent_id": "root",
  "ancestor_ids": [
    "root"
  ], 
  "description": "null", 
  "kind": "Topic", 
  "h_position": -10,
  "v_position": 6, 
  "slug": "math"
}

and I want it to look like this after deleting the unnecessary spaces, tabs, line breaks, etc.:

{"path":"/math/","id":"math","title":"Math Title","icon_url":"/images/power-mode/badges/circles-40x40.png",
"contains":["Topic","Video","Exercise"],"children":[],"parent_id":"root","ancestor_ids":["root"],
"description":"null","kind":"Topic","h_position":-10,"v_position":6,"slug":"math"}

Basically every space should be deleted except for those within quotes.

ibanez221 asked Jan 11 '23 01:01


2 Answers

You can use jq's -c (--compact-output) option together with the identity filter `.`:

jq -c . < your-file.json

Demo:

$ echo '
> {
>   "a": "b"
> }' | jq -c .
{"a":"b"}

Stephan202 answered Jan 19 '23 01:01


You could parse the JSON in code and then write it back out in a compact format. The spaces within quotes are preserved because they are part of the string values.

In Python you could use the built-in json module:

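import json
# your-file.min.json is a placeholder output name
with open("your-file.json") as src, open("your-file.min.json", "w") as dst:
    # json.dump puts a space after "," and ":" by default; separators removes them
    json.dump(json.load(src), dst, separators=(",", ":"))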

Details are in the Python docs: https://docs.python.org/2/library/json.html
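
As a quick check that whitespace inside strings survives the round trip, here is a minimal sketch on an abbreviated version of the question's object. The helper name `minify_json` and the `sample` text are made up for illustration; only `json.loads` and `json.dumps` come from the standard library.

```python
import json

def minify_json(text):
    """Parse JSON text and re-serialize it without optional whitespace."""
    # separators=(",", ":") drops the spaces json.dumps inserts by default
    return json.dumps(json.loads(text), separators=(",", ":"))

# Abbreviated sample based on the object in the question
sample = '''{
  "title": "Math Title",
  "contains": [
    "Topic",
    "Video"
  ],
  "h_position": -10
}'''

print(minify_json(sample))
# → {"title":"Math Title","contains":["Topic","Video"],"h_position":-10}
```

The space in "Math Title" is kept because it belongs to the string value. Note that json.dumps escapes non-ASCII characters as \uXXXX by default; pass ensure_ascii=False if the file should keep them verbatim.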

The same technique should work in any language that has a JSON library.
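
If you would rather not parse the file at all, a different technique is to scan the text and drop whitespace only while outside a string literal. This is a rough sketch, not a validated minifier: it assumes the input is already valid JSON, and the function name `strip_outside_quotes` is hypothetical.

```python
def strip_outside_quotes(text):
    """Remove JSON whitespace that is not inside a double-quoted string."""
    out = []
    in_string = False   # True while between an opening and closing quote
    escaped = False     # True right after a backslash inside a string
    for ch in text:
        if in_string:
            out.append(ch)          # keep everything inside strings
            if escaped:
                escaped = False     # this char was escaped, e.g. \" or \\
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False   # unescaped quote closes the string
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch not in " \t\r\n":
            out.append(ch)          # keep structural chars, numbers, literals
    return "".join(out)

# The sample contains escaped quotes to exercise the escape handling
sample = '{ "title": "Math \\"Big\\" Title",\n\t"n": [1, 2 ] }'
print(strip_outside_quotes(sample))
```

Unlike the parse-and-dump approach, this does not check that the input is well formed, so malformed JSON passes through silently.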

Patrick answered Jan 19 '23 02:01