I am fiddling around with outputting a JSON file with some attributes of the files within a directory. My problem is that, when appending to the file, there is no separator between the objects. I could just add a comma after each 'f' and delete the last one, but that seems like a sloppy workaround to me.
import os
import os.path
import json

# Create and open file_data.txt and append
with open('file_data.txt', 'a') as outfile:
    files = os.listdir(os.curdir)
    for f in files:
        extension = os.path.splitext(f)[1][1:]
        base = os.path.splitext(f)[0]
        name = f
        data = {
            "file_name" : name,
            "extension" : extension,
            "base_name" : base
        }
        json.dump(data, outfile)
This outputs:
{"file_name": "contributors.txt", "base_name": "contributors", "extension": "txt"}{"file_name": "read_files.py", "base_name": "read_files", "extension": "py"}{"file_name": "file_data.txt", "base_name": "file_data", "extension": "txt"}{"file_name": ".git", "base_name": ".git", "extension": ""}
What I would like is actual JSON:
{"file_name": "contributors.txt", "base_name": "contributors", "extension": "txt"},{"file_name": "read_files.py", "base_name": "read_files", "extension": "py"},{"file_name": "file_data.txt", "base_name": "file_data", "extension": "txt"},{"file_name": ".git", "base_name": ".git", "extension": ""}
JSON object: In JSON, an object is an unordered set of comma-separated name-value pairs that begins with a left brace ({) and ends with a right brace (}). Each name is followed by a colon (:) and then its value.
json.dump() serializes a Python object as JSON-formatted data and writes it to a file. json.dumps() encodes a Python object into a JSON-formatted string.
Use dump() when the Python object has to be stored in a file, and dumps() when you need the result as a string, for example for printing or logging. dump() takes an open file-like object (anything with a .write() method) as its second argument, not a file name.
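To make the difference concrete, here is a small sketch. The record contents are made up for illustration, and an in-memory io.StringIO stands in for a real file so the snippet leaves nothing on disk.

```python
import io
import json

# Illustrative record; the values are not taken from a real directory.
record = {"file_name": "read_files.py", "extension": "py"}

# dumps() returns the JSON text as a str.
text = json.dumps(record)

# dump() writes the same text to an open file-like object.
buffer = io.StringIO()
json.dump(record, buffer)

print(text)
print(buffer.getvalue() == text)
```

With default settings both calls should produce the same text; the only difference is where it ends up.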
What you're getting is not a JSON object, but a stream of separate JSON objects.
What you would like is still not a JSON object, but a stream of separate JSON objects with commas between them. That's not going to be any more parseable.*
* The JSON spec is simple enough to parse by hand, and it should be pretty clear that an object followed by another object with a comma in between doesn't match any valid production.
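A quick way to check this claim, as a sketch with made-up sample data: Python's own parser rejects two objects separated by a comma, but accepts the same text once it is wrapped in square brackets to form an array.

```python
import json

# Two objects with a comma between them, as requested in the question.
separated = '{"a": 1},{"b": 2}'

try:
    json.loads(separated)
    parsed_ok = True
except json.JSONDecodeError as exc:
    parsed_ok = False
    print("not valid JSON:", exc)

# Wrapping the same text in [ ] turns it into a valid JSON array.
as_array = json.loads("[" + separated + "]")
print(as_array)
```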
If you're trying to create a JSON array, you can do that. The obvious way, unless there are memory issues, is to build a list of dicts, then dump that all at once:
output = []
for f in files:
    # ...
    output.append(data)
json.dump(output, outfile)
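Filled in with the loop from the question, the list-of-dicts approach might look like the sketch below. The function names describe_files and write_file_data are hypothetical, and the file is opened with 'w' instead of 'a' as an assumption: appending a second array to an existing file would again leave two top-level values back to back.

```python
import json
import os
import tempfile


def describe_files(directory):
    """Return one dict per directory entry, using the question's field names."""
    records = []
    for name in sorted(os.listdir(directory)):
        base, ext = os.path.splitext(name)
        records.append({
            "file_name": name,
            "extension": ext[1:],  # drop the leading dot
            "base_name": base,
        })
    return records


def write_file_data(directory, out_path):
    # 'w' rather than 'a' (an assumption, see the note above).
    with open(out_path, "w") as outfile:
        json.dump(describe_files(directory), outfile)


# Demo on throwaway directories so nothing real is touched.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for name in ("contributors.txt", "read_files.py"):
        open(os.path.join(src, name), "w").close()
    out_path = os.path.join(dst, "file_data.txt")
    write_file_data(src, out_path)
    with open(out_path) as infile:
        loaded = json.load(infile)

print(loaded)
```

Because the whole list is dumped in one call, the file holds a single JSON array that json.load can read back directly.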
If memory is an issue, you have a few choices:
- Write the [, the commas between values, and the ] manually. (But note that it is not valid JSON to have an extra trailing comma after the last value, even if some decoders will accept it.)
- Wrap your loop in a generator that yields each data, and extend JSONEncoder to convert iterators to arrays. (Note that this is actually used as the example in the docs of why and how to extend JSONEncoder, although you might want to write a more memory-efficient implementation.)
However, it's worth considering what you're trying to do. Maybe a stream of separate JSON objects actually is the right file format/protocol/API for what you're trying to do. Because JSON is self-delimiting, there's really no reason to add a delimiter between separate values. (And it doesn't even help much with robustness, unless you use a delimiter that isn't going to show up all over the actual JSON, as a comma is.) For example, what you've got is exactly what JSON-RPC is supposed to look like.
If you're just asking for something different because you don't know how to parse such a file, that's pretty easy. For example (using a string rather than a file for simplicity):
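import json

def iter_json_values(s):  # illustrative name for the enclosing generator
    i = 0
    d = json.JSONDecoder()
    while True:
        try:
            # raw_decode returns the decoded value and the index where it ended
            obj, i = d.raw_decode(s, i)
        except ValueError:
            # End of input (or malformed data): stop iterating
            return
        yield obj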
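Here is a self-contained sketch of the same idea with a usage demo. The name iter_concatenated_json and the sample stream are made up for illustration. It differs from the snippet above in two ways, both of which are my assumptions about what you would want: it skips whitespace between values (raw_decode does not skip leading whitespace itself), and it lets malformed input raise instead of silently ending the iteration.

```python
import json


def iter_concatenated_json(text):
    """Yield each JSON value from a string of back-to-back JSON values."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while True:
        # Skip JSON whitespace between values; raw_decode will not do it.
        while pos < end and text[pos] in " \t\n\r":
            pos += 1
        if pos >= end:
            return
        # Raises json.JSONDecodeError (a ValueError subclass) on bad input.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Two objects back to back, separated only by a newline.
stream = (
    '{"file_name": "a.txt", "extension": "txt"}\n'
    '{"file_name": ".git", "extension": ""}'
)
values = list(iter_concatenated_json(stream))
print(values)
```

For a real file you could read its contents into a string first; for very large files a chunked reader would be needed, which this sketch does not attempt.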