Comma separator between JSON objects with json.dump

Tags: python, json

I am fiddling around with outputting a JSON file with some attributes of the files within a directory. My problem is that when appending to the file, there is no separator between the objects. I could just add a comma after each 'f' and delete the last one, but that seems like a sloppy workaround to me.

import os
import json

# Open file_data.txt in append mode and write one JSON object per file
with open('file_data.txt', 'a') as outfile:
    files = os.listdir(os.curdir)

    for f in files:
        extension = os.path.splitext(f)[1][1:]
        base = os.path.splitext(f)[0]
        name = f

        data = {
            "file_name": name,
            "extension": extension,
            "base_name": base
        }

        json.dump(data, outfile)

This outputs:

{"file_name": "contributors.txt", "base_name": "contributors", "extension": "txt"}{"file_name": "read_files.py", "base_name": "read_files", "extension": "py"}{"file_name": "file_data.txt", "base_name": "file_data", "extension": "txt"}{"file_name": ".git", "base_name": ".git", "extension": ""}

What I would like is actual JSON:

{"file_name": "contributors.txt", "base_name": "contributors", "extension": "txt"},{"file_name": "read_files.py", "base_name": "read_files", "extension": "py"},{"file_name": "file_data.txt", "base_name": "file_data", "extension": "txt"},{"file_name": ".git", "base_name": ".git", "extension": ""}

asked Nov 05 '14 20:11

cassiusclay

1 Answer

What you're getting is not a JSON object, but a stream of separate JSON objects.

What you would like is still not a JSON object, but a stream of separate JSON objects with commas between them. That's not going to be any more parseable.*

* The JSON spec is simple enough to parse by hand, and it should be pretty clear that an object followed by a comma and another object doesn't match any valid production.

If you're trying to create a JSON array, you can do that. The obvious way, unless there are memory issues, is to build a list of dicts, then dump that all at once:

output = []
for f in files:
    # ...
    output.append(data)
json.dump(output, outfile)
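
As a fuller illustration of the list-then-dump approach, here is a minimal sketch that reuses the three keys from the question. The function names collect_file_info and write_file_info are hypothetical, introduced only for this example. One assumption worth flagging: the output file is opened with mode 'w' rather than 'a', because appending a second array to an existing file would again leave something that is not a single JSON document.

```python
import json
import os


def collect_file_info(directory):
    """Return one dict per directory entry, using the question's three keys."""
    records = []
    for name in sorted(os.listdir(directory)):
        base, ext = os.path.splitext(name)
        records.append({
            "file_name": name,
            "extension": ext[1:],  # drop the leading dot
            "base_name": base,
        })
    return records


def write_file_info(directory, out_path):
    """Dump all records to out_path as a single JSON array."""
    records = collect_file_info(directory)
    # Mode 'w', not 'a': a second array appended to the same file would
    # no longer be one valid JSON document.
    with open(out_path, "w") as outfile:
        json.dump(records, outfile)
```

A file written this way can be read back in one call with json.load.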

If memory is an issue, you have a few choices:

- For a quick-and-dirty solution, you can fake it by writing the "[", ",", and "]" manually. (But note that it is not valid JSON to have an extra trailing comma after the last value, even if some decoders will accept it.)
- You can wrap your loop in a generator function that yields each data dict, and extend JSONEncoder to convert iterators to arrays. (Note that this is actually the example used in the docs to show why and how to extend JSONEncoder, although you might want to write a more memory-efficient implementation.)
- You can look for a third-party JSON library that has some kind of built-in iterative streaming API.
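
The first two options might look roughly like this. This is only a sketch: dump_array_streaming and IterableEncoder are hypothetical names chosen for the example. The first function writes the brackets and separators by hand, putting the comma before every item except the first so that no trailing comma is produced. The encoder follows the pattern shown in the json module documentation for JSONEncoder.default and, as noted above, is not memory-efficient because it materializes the whole list.

```python
import json


def dump_array_streaming(items, fp):
    """Write items from any iterable as one JSON array, one item at a time."""
    fp.write("[")
    first = True
    for item in items:
        if not first:
            fp.write(", ")  # separator precedes every item but the first
        json.dump(item, fp)
        first = False
    fp.write("]")


class IterableEncoder(json.JSONEncoder):
    """Encode otherwise unsupported iterables (e.g. generators) as arrays."""

    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            # Simple, but builds the full list in memory before encoding.
            return list(iterable)
        return super().default(o)
```

With the first function, only one item needs to be in memory at a time, so it can be fed directly from a generator that yields each data dict. The encoder is used by passing cls=IterableEncoder to json.dump or json.dumps.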

However, it's worth considering what you're actually trying to do. Maybe a stream of separate JSON objects is the right file format, protocol, or API for your purposes. Because JSON is self-delimiting, there's really no reason to add a delimiter between separate values. (And it doesn't even help much with robustness, unless you use a delimiter that isn't going to show up all over the actual JSON, as "," is.) For example, what you've got is exactly what JSON-RPC is supposed to look like.

If you're only asking for something different because you don't know how to parse such a file, that's pretty easy. For example (using a string rather than a file for simplicity):

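import json

def iter_json_objects(s):
    # Yield each JSON value from a string of concatenated JSON values.
    i = 0
    d = json.JSONDecoder()
    while True:
        try:
            obj, i = d.raw_decode(s, i)
        except ValueError:
            # No further value could be decoded; stop iterating.
            return
        yield obj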
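
One caveat with the loop above: raw_decode does not skip leading whitespace, and catching ValueError ends the loop on malformed input as well as at the end of the string. A slightly more defensive variant might look like the sketch below. The name iter_json_values is hypothetical, and the sample string is a shortened version of the question's output.

```python
import json


def iter_json_values(text):
    """Yield each JSON value from a string of back-to-back JSON values."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so step past it here.
        while pos < len(text) and text[pos] in " \t\r\n":
            pos += 1
        if pos == len(text):
            return
        # Malformed input propagates as json.JSONDecodeError instead of
        # silently ending the loop.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


sample = (
    '{"file_name": "read_files.py", "extension": "py"}'
    '{"file_name": ".git", "extension": ""}'
)
names = [obj["file_name"] for obj in iter_json_values(sample)]
```

For a file, the simplest way to use this is to read the whole content into a string first. That gives up the memory benefit for very large files, so treat it as a starting point rather than a streaming parser.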

answered Oct 23 '22 15:10

abarnert
