
Comma separator between JSON objects with json.dump




I am fiddling around with outputting a JSON file with some attributes of the files within a directory. My problem is that when appending to the file, there is no separator between the objects. I could just add a comma after each 'f' and delete the last one, but that seems like a sloppy workaround to me.

import os
import os.path
import json

#Create and open file_data.txt and append 
with open('file_data.txt', 'a') as outfile:

    files = os.listdir(os.curdir)

    for f in files:

        extension = os.path.splitext(f)[1][1:]
        base = os.path.splitext(f)[0]
        name = f

        data = {
            "file_name" : name,
            "extension" : extension,
            "base_name" : base
        }

        json.dump(data, outfile)

This outputs:

{"file_name": "contributors.txt", "base_name": "contributors", "extension": "txt"}{"file_name": "read_files.py", "base_name": "read_files", "extension": "py"}{"file_name": "file_data.txt", "base_name": "file_data", "extension": "txt"}{"file_name": ".git", "base_name": ".git", "extension": ""}

What I would like is actual JSON:

{"file_name": "contributors.txt", "base_name": "contributors", "extension": "txt"},{"file_name": "read_files.py", "base_name": "read_files", "extension": "py"},{"file_name": "file_data.txt", "base_name": "file_data", "extension": "txt"},{"file_name": ".git", "base_name": ".git", "extension": ""}

cassiusclay asked Nov 05 '14 20:11



1 Answer

What you're getting is not a JSON object, but a stream of separate JSON objects.

What you would like is still not a JSON object, but a stream of separate JSON objects with commas between them. That's not going to be any more parseable.*

* The JSON spec is simple enough to parse by hand, and it should be pretty clear that an object followed by another object with a comma in between doesn't match any valid production.
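
A quick check with the standard json module illustrates the point. This is only a sketch: the sample strings and the helper name parses are made up for illustration.

```python
import json

def parses(text):
    """Return True if text is a single valid JSON document."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

concatenated = '{"a": 1}{"b": 2}'    # shape of what the question's code writes
comma_joined = '{"a": 1},{"b": 2}'   # shape of what the question asks for
as_array = '[{"a": 1}, {"b": 2}]'    # a real JSON array

print(parses(concatenated))  # False
print(parses(comma_joined))  # False
print(parses(as_array))      # True
```

Only the bracketed form is accepted as one document; adding commas alone changes nothing for the parser.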

If you're trying to create a JSON array, you can do that. The obvious way, unless there are memory issues, is to build a list of dicts, then dump that all at once:

output = []
for f in files:
    # ... build data as before ...
    output.append(data)
json.dump(output, outfile)
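
For completeness, here is one way the list-based approach might look with the fields from the question filled in. It is a sketch, not the asker's final code: the function names collect_file_info and write_file_info are invented here, and opening the file in 'w' mode is an assumption. Write mode matters because appending a second array to the same file on a later run would again produce an invalid document.

```python
import json
import os

def collect_file_info(directory):
    """Build one dict per directory entry, using the question's keys."""
    output = []
    for name in sorted(os.listdir(directory)):
        base, ext = os.path.splitext(name)
        output.append({
            "file_name": name,
            "extension": ext[1:],  # drop the leading dot, if any
            "base_name": base,
        })
    return output

def write_file_info(directory, out_path):
    """Dump the whole list at once so the file holds exactly one JSON array."""
    # 'w' rather than 'a': the array must be the only thing in the file.
    with open(out_path, "w") as outfile:
        json.dump(collect_file_info(directory), outfile)
```

Calling write_file_info(os.curdir, "file_data.txt") would then produce a file that json.load can read back as a list of dicts.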

If memory is an issue, you have a few choices:

  • For a quick-and-dirty solution, you can fake it by writing the "[", ",", and "]" manually. (But note that it is not valid JSON to have an extra trailing comma after the last value, even if some decoders will accept it.)
  • You can wrap your loop up in a generator function that yields each data, and extend JSONEncoder to convert iterators to arrays. (Note that this is actually used as the example in the docs of why and how to extend JSONEncoder, although you might want to write a more memory-efficient implementation.)
  • You can look for a third-party JSON library that has some kind of built-in iterative streaming API.
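
The first two options can be sketched as follows, using only the standard library. The names write_json_array, IterEncoder, and records are invented for this example. The encoder follows the pattern shown in the json module documentation and, as noted above, it builds a full list from the iterator, so it is not the memory-efficient variant.

```python
import io
import json

def write_json_array(items, outfile):
    """Stream items to outfile as one JSON array, one element at a time."""
    outfile.write("[")
    for index, item in enumerate(items):
        if index:
            # Comma before every element except the first,
            # so there is never a trailing comma.
            outfile.write(",")
        json.dump(item, outfile)
    outfile.write("]")

class IterEncoder(json.JSONEncoder):
    """Encode arbitrary iterables as JSON arrays (builds a list first)."""
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            return super().default(o)
        return list(iterable)

def records():
    # Hypothetical data standing in for the per-file dicts.
    yield {"file_name": "a.txt"}
    yield {"file_name": "b.py"}

buffer = io.StringIO()
write_json_array(records(), buffer)
print(buffer.getvalue())
print(json.dumps(records(), cls=IterEncoder))
```

Both calls produce a single array that json.loads accepts. Only write_json_array avoids holding every element in memory at once.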

However, it's worth considering what you're trying to do. Maybe a stream of separate JSON objects actually is the right file format/protocol/API for what you're trying to do. Because JSON is self-delimiting, there's really no reason to add a delimiter between separate values. (And it doesn't even help much with robustness, unless you use a delimiter that isn't going to show up all over the actual JSON, as a comma is.) For example, what you've got is exactly what JSON-RPC is supposed to look like.

If you're just asking for something different because you don't know how to parse such a file, that's pretty easy. For example (using a string rather than a file for simplicity):

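import json

def iter_objects(s):
    # Yield each JSON value found in the string s, one after another.
    i = 0
    d = json.JSONDecoder()
    while True:
        try:
            obj, i = d.raw_decode(s, i)
        except ValueError:
            return  # end of input, or no further valid value
        yield obj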
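
Applied to data shaped like the question's output, the same idea might look like the sketch below. The function name iter_json_values and the sample string are hypothetical. The whitespace skipping is an addition: raw_decode does not accept leading whitespace, so a stream with newlines or spaces between values needs it. Unlike the loop above, this version lets a decoding error propagate instead of stopping silently.

```python
import json

def iter_json_values(text):
    """Yield each JSON value from a string of concatenated JSON values."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # Skip any whitespace between values.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            return
        # raw_decode returns the value and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

stream = '{"file_name": "a.txt"}{"file_name": "b.py"}\n{"file_name": ".git"}'
names = [value["file_name"] for value in iter_json_values(stream)]
print(names)  # ['a.txt', 'b.py', '.git']
```

For a file, read its contents into a string first and pass that in.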
abarnert answered Oct 23 '22 15:10
