
Dump to JSON adds additional double quotes and escaping of quotes

Tags: python, json

I am retrieving Twitter data with a Python tool and dumping it to disk in JSON format. I noticed that the entire data string for each tweet is unintentionally enclosed in double quotes. Furthermore, all double quotes that belong to the actual JSON formatting are escaped with a backslash.

They look like this:

"{\"created_at\":\"Fri Aug 08 11:04:40 +0000 2014\",\"id\":497699913925292032,

How do I avoid that? It should be:

{"created_at":"Fri Aug 08 11:04:40 +0000 2014" .....

My file-out code looks like this:

with io.open('data'+self.timestamp+'.txt', 'a', encoding='utf-8') as f:
    f.write(unicode(json.dumps(data, ensure_ascii=False)))
    f.write(unicode('\n'))

The unintended escaping causes problems when reading in the JSON file in a later processing step.

asked Aug 11 '14 at 11:08 by toobee



1 Answer

You are double encoding your JSON strings. data is already a JSON string, and doesn't need to be encoded again:

>>> import json
>>> not_encoded = {"created_at":"Fri Aug 08 11:04:40 +0000 2014"}
>>> encoded_data = json.dumps(not_encoded)
>>> print encoded_data
{"created_at": "Fri Aug 08 11:04:40 +0000 2014"}
>>> double_encode = json.dumps(encoded_data)
>>> print double_encode
"{\"created_at\": \"Fri Aug 08 11:04:40 +0000 2014\"}"

Just write these directly to your file:

with open('data{}.txt'.format(self.timestamp), 'a') as f:
    f.write(data + '\n')
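If you are not sure whether data arrives as a string or as a Python object, or you already have files containing double-encoded lines, something like the following might help. This is only a rough sketch in Python 3 syntax (the code above is Python 2). The helper names to_json_line and decode_line are made up for illustration, and it assumes that any string passed in is already valid JSON text and that each line of the file holds one record:

```python
import json


def to_json_line(data):
    """Return one line of JSON text, encoding only when needed."""
    # Assumption: a str here is already serialized JSON (as in the question),
    # so it is written as-is instead of being passed to json.dumps again.
    if isinstance(data, str):
        return data + "\n"
    return json.dumps(data, ensure_ascii=False) + "\n"


def decode_line(line):
    """Parse one line, undoing a single level of accidental double encoding."""
    obj = json.loads(line)
    # A double-encoded record decodes to a str rather than a dict,
    # so decode once more to get the actual object.
    if isinstance(obj, str):
        obj = json.loads(obj)
    return obj
```

Note that decode_line treats every top-level JSON string as a double-encoded record, so it is not suitable for files where plain string values are legitimate records.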
answered Sep 28 '22 at 05:09 by Martijn Pieters