I started by reading a CSV into a pandas DataFrame with the read_csv() function. Now that the data is in a DataFrame, I tried to write it out as JSON one row at a time, like this:
for row in df.iterrows():
    row[1].to_json(path_to_file)
This works, but only the last row is saved to disk because each call to row[1].to_json(path_to_file) overwrites the file. I've tried a few other file-handling options, but to no avail. Can anyone shed some light on how to proceed?
Thank you!
To create newline-delimited JSON from a DataFrame df, run the following:
df.to_json("path/to/filename.json",
           orient="records",
           lines=True)
Pay close attention to those optional keyword args! The lines option was added in pandas 0.19.0, and it is only valid together with orient="records".
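As a rough end-to-end sketch of the call above: the small DataFrame, the temporary directory, and the rows.jsonl file name are all made up for illustration. It writes the frame once, then reads the file back two ways to confirm there is one JSON object per line.

```python
import json
import os
import tempfile

import pandas as pd

# Illustrative data only; any DataFrame works the same way.
df = pd.DataFrame({"a": [1, 3, 5], "b": [1.1, 1.2, 1.2]})

# Hypothetical output location; a temp directory keeps the sketch self-contained.
out_path = os.path.join(tempfile.mkdtemp(), "rows.jsonl")

# A single call writes every row as its own JSON object, one per line.
df.to_json(out_path, orient="records", lines=True)

# Each non-empty line parses on its own as a complete JSON document.
with open(out_path) as fh:
    records = [json.loads(line) for line in fh if line.strip()]

# pandas can read the same file back if it is given the same lines=True flag.
round_trip = pd.read_json(out_path, lines=True)
```

Because the whole frame is written in one call, there is no loop and nothing gets overwritten.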
You can pass a buffer (an open file handle) to to_json() instead of a path, and write the newline after each row yourself:
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({"a":[1,3,5], "b":[1.1,1.2,1.2]})
In [3]: df
Out[3]:
   a    b
0  1  1.1
1  3  1.2
2  5  1.2
In [4]: f = open("temp.txt", "w")
In [5]: for row in df.iterrows():
   ...:     row[1].to_json(f)
   ...:     f.write("\n")
   ...:
In [6]: f.close()
In [7]: open("temp.txt").read()
Out[7]: '{"a":1.0,"b":1.1}\n{"a":3.0,"b":1.2}\n{"a":5.0,"b":1.2}\n'
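Note that the output above shows "a":1.0 rather than "a":1. iterrows() hands back each row as a Series, and a row mixing ints and floats is upcast to float. The sketch below reproduces that and compares it with serialising the whole frame at once. It is only an illustration: io.StringIO stands in for the open file, and the variable names are invented here.

```python
import io
import json

import pandas as pd

df = pd.DataFrame({"a": [1, 3, 5], "b": [1.1, 1.2, 1.2]})

# Row-by-row: an in-memory buffer stands in for an open file handle.
buf = io.StringIO()
for _, row in df.iterrows():
    # With no path argument, to_json() returns the JSON text for this row.
    buf.write(row.to_json())
    buf.write("\n")

row_wise_first = json.loads(buf.getvalue().splitlines()[0])

# Whole frame at once: each column keeps its own dtype.
frame_text = df.to_json(orient="records", lines=True)
frame_wise_first = json.loads(frame_text.splitlines()[0])

print(type(row_wise_first["a"]).__name__)    # float
print(type(frame_wise_first["a"]).__name__)  # int
```

If integer columns need to stay integers in the output, the single to_json(..., orient="records", lines=True) call is probably the safer choice.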