
How to remove whitespaces and newlines from every value in a JSON file?

Tags: python, json, strip

I have a JSON file that has the following structure:

{
    "name":[
        {
            "someKey": "\n\n   some Value   "
        },
        {
            "someKey": "another value    "
        }
    ],
    "anotherName":[
        {
            "anArray": [
                {
                    "key": "    value\n\n",
                    "anotherKey": "  value"
                },
                {
                    "key": "    value\n",
                    "anotherKey": "value"
                }
            ]
        }
    ]
}

Now I want to strip all the leading and trailing whitespace and newlines from every value in the JSON file. Is there some way to iterate over each element of the dictionary and its nested dictionaries and lists?

John West asked Jun 13 '13 22:06



2 Answers

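"Now I want to strip all the whitespace and newlines from every value in the JSON file"

You can use pkgutil.simplegeneric() (an undocumented helper in the standard library) to create a function, get_items(), that yields the (key, value) pairs of a JSON object or the (index, value) pairs of a JSON array: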

import json
import sys
from pkgutil import simplegeneric

@simplegeneric
def get_items(obj):
    while False: # no items, a scalar object
        yield None

@get_items.register(dict)
def _(obj):
    return obj.items() # json object. Edit: iteritems() was removed in Python 3

@get_items.register(list)
def _(obj):
    return enumerate(obj) # json array

def strip_whitespace(json_data):
    for key, value in get_items(json_data):
        if hasattr(value, 'strip'): # json string
            json_data[key] = value.strip()
        else:
            strip_whitespace(value) # recursive call


data = json.load(sys.stdin) # read json data from standard input
strip_whitespace(data)
json.dump(data, sys.stdout, indent=2)

Note: functools.singledispatch() (Python 3.4+) would allow dispatching on the abstract base classes collections.abc.MutableMapping and MutableSequence instead of the concrete dict and list types used here.
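The functools.singledispatch() variant mentioned in the note might look like the sketch below. It is an illustration, not part of the original answer: it assumes Python 3.4+, reuses the names get_items() and strip_whitespace() from the code above, and runs on a small made-up sample instead of standard input.

```python
import json
from collections.abc import MutableMapping, MutableSequence
from functools import singledispatch


@singledispatch
def get_items(obj):
    """Default case: a scalar (str, number, bool, None) has no child items."""
    return iter(())


@get_items.register(MutableMapping)
def _(obj):
    # JSON object; copy the pairs so values can be reassigned while looping
    return list(obj.items())


@get_items.register(MutableSequence)
def _(obj):
    # JSON array; enumerate() yields (index, value) pairs
    return enumerate(obj)


def strip_whitespace(json_data):
    """Strip every string value in place, recursing into nested containers."""
    for key, value in get_items(json_data):
        if isinstance(value, str):
            json_data[key] = value.strip()
        else:
            strip_whitespace(value)


# Illustrative sample only, not the asker's full file
data = json.loads('{"name": [{"someKey": "\\n\\n   some Value   "}], "n": 1}')
strip_whitespace(data)
print(json.dumps(data))
```

Because dispatch is on the abstract base classes, the same function should also handle dict-like or list-like containers that are registered with those ABCs, such as collections.OrderedDict.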

Output

{
  "anotherName": [
    {
      "anArray": [
        {
          "anotherKey": "value", 
          "key": "value"
        }, 
        {
          "anotherKey": "value", 
          "key": "value"
        }
      ]
    }
  ], 
  "name": [
    {
      "someKey": "some Value"
    }, 
    {
      "someKey": "another value"
    }
  ]
}
jfs answered Sep 19 '22 08:09


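Read the file and parse it with the json module:

import json

with open('data.json') as f:       # 'data.json' is a placeholder path
    text = f.read()
text = text.replace('\n', '')      # do your cleanup here; this drops only literal line breaks, not escaped \n inside values
data = json.loads(text)

Then walk through the resulting data structure and strip each string value.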
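One way to walk the parsed structure is a small recursive function that returns a cleaned copy. This is a minimal sketch, not the answerer's code: the name strip_values() and the sample string are hypothetical, and only values are stripped while keys are left unchanged.

```python
import json


def strip_values(node):
    """Return a copy of node with every string value stripped."""
    if isinstance(node, str):
        return node.strip()
    if isinstance(node, list):
        return [strip_values(item) for item in node]
    if isinstance(node, dict):
        return {key: strip_values(value) for key, value in node.items()}
    return node  # numbers, booleans and None pass through unchanged


# Illustrative sample modelled on the structure in the question
raw = '{"anArray": [{"key": "    value\\n\\n", "anotherKey": "  value"}]}'
cleaned = strip_values(json.loads(raw))
print(json.dumps(cleaned))
```

Building a new structure instead of mutating the parsed one avoids reassigning entries in a container while iterating over it, at the cost of a second copy in memory.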

Brent Washburne answered Sep 21 '22 08:09