
Converting a CSV file to JSON with Python, using a specific JSON format

Tags: python, json, csv

Can I convert a CSV file into JSON as follows?
csv = headers on line 1 with values below
json = [{"key1":"value1",...},{"key1":"value2",...}...]

This is my csv file:

$ cat -v head_data.csv
"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)","Device Manufacturer","Device Model","Product Description"
"2016-05-30","686","230","63979","Samsung SM-G935FD ","Samsung SM-G935FD","$29.95 Carryover Plan (1GB)"
"2016-05-30","533","970","171631866","Apple iPhone 6 (A1586)","iPhone 6 (A1586)","$69.95 Plan"
"2016-05-30","191","610","145713","Samsung GT-I9195","Samsung GT-I9195","$29.95 Plan"
"2016-05-30","660","660","2994742","Samsung SM-N920I","Samsung SM-N920I","GOVERNMENT TIER 2 PLAN"
"2016-05-30","182","970","37799939","Samsung SM-J200Y","Samsung SM-J200Y","PREPAY PLUS - $0 -"
"2016-05-30","993","360","14096114","Samsung SM-A300Y","Samsung SM-A300Y","$39.95 Carryover Plan"
"2016-05-30","894","730","9851177","Samsung GT-N7105","Samsung GT-N7105","PREPAY STD - $0 - #2"
"2016-05-30","600","070","18420650","Apple iPhone 5C (A1529)","Apple iPhone 5C (A1529)","PREPAY PLUS - $0 -"
"2016-05-30","234","000","1769661","Galaxy S7 SM-G930F ","Galaxy S7 SM-G930F","$39.95 Plan"

This is my script:

$ cat csv_to_json.py

#!/usr/bin/python

#from here
#https://stackoverflow.com/a/7550352/2392358

import csv, json
csvreader = csv.reader(open('head_data.csv', 'rb'), delimiter='\t',
quotechar='"')
data = []
for row in csvreader:
    r = []
    for field in row:
        if field == '': field = None
        else: field = unicode(field, 'ISO-8859-1')
        r.append(field)
    data.append(r)
jsonStruct = {
    'header': data[0],
    'data': data[1:]
}
open('head_data.json', 'wb').write(json.dumps(jsonStruct))

Running my script and its output:

$ python csv_to_json.py


$ cat -v head_data.json
{"header": ["Rec Open Date,\"MSISDN\",\"IMEI\",\"Data Volume (Bytes)\",\"Device Manufacturer\",\"Device Model\",\"Product Description\""], "data": [["2016-05-30,\"686\",\"230\",\"63979\",\"Samsung SM-G935FD \",\"Samsung SM-G935FD\",\"$29.95 Carryover Plan (1GB)\""], ["2016-05-30,\"533\",\"970\",\"171631866\",\"Apple iPhone 6 (A1586)\",\"iPhone 6 (A1586)\",\"$69.95 Plan\""], ["2016-05-30,\"191\",\"610\",\"145713\",\"Samsung GT-I9195\",\"Samsung GT-I9195\",\"$29.95 Plan\""], ["2016-05-30,\"660\",\"660\",\"2994742\",\"Samsung SM-N920I\",\"Samsung SM-N920I\",\"GOVERNMENT TIER 2 PLAN\""], ["2016-05-30,\"182\",\"970\",\"37799939\",\"Samsung SM-J200Y\",\"Samsung SM-J200Y\",\"PREPAY PLUS - $0 -\""], ["2016-05-30,\"993\",\"360\",\"14096114\",\"Samsung SM-A300Y\",\"Samsung SM-A300Y\",\"$39.95 Carryover Plan\""], ["2016-05-30,\"894\",\"730\",\"9851177\",\"Samsung GT-N7105\",\"Samsung GT-N7105\",\"PREPAY STD - $0 - #2\""], ["2016-05-30,\"600\",\"070\",\"18420650\",\"Apple iPhone 5C (A1529)\",\"Apple iPhone 5C (A1529)\",\"PREPAY PLUS - $0 -\""], ["2016-05-30,\"234\",\"000\",\"1769661\",\"Galaxy S7 SM-G930F \",\"Galaxy S7 SM-G930F\",\"$39.95 Plan\""]]}

Can I slightly modify the code so that I get output like this:

[{"Rec Open Date":"2016-07-03","MSISDN":540,"IMEI":990,"Data Volume (Bytes)":36671453,"Device Manufacturer":"HUAWEI Technologies Co Ltd","Device Model":"H1512","Product Description":"PREPAY PLUS - $0 -"},
{"Rec Open Date":"2016-07-03","MSISDN":334,"IMEI":340,"Data Volume (Bytes)":129835114,"Device Manufacturer":"Apple Inc","Device Model":"Apple iPhone S (A1530)","Product Description":"$29.95 Plan"},
{"Rec Open Date":"2016-07-03","MSISDN":133,"IMEI":870,"Data Volume (Bytes)":42213030,"Device Manufacturer":"Apple Inc","Device Model":"Apple iPhone 6 Plus (A1524)","Product Description":"$49.95 Plan"}]

related Q here and here

EDIT1 - I found this here, but it does the conversion in the browser and I think it uses JS.

EDIT2 - based on the answer below this is what I want

This is the file I want to convert

$ cat -v head_data.csv
"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)","Device Manufacturer","Device Model","Product Description"
"2016-05-30","686","230","63979","Samsung SM-G935FD ","Samsung,A, SM-G935FD","$29.95 Carryover Plan (1GB)"
"2016-05-30","533","970","171631866","Apple iPhone 6 (A1586)","iPhone 6 (A1586)","$69.95 Plan"
"2016-05-30","191","610","145713","Samsung GT-I9195","Samsung GT-I9195","$29.95 Plan"
"2016-05-30","660","660","2994742","Samsung SM-N920I","Samsung SM-N920I","GOVERNMENT TIER 2 PLAN"
"2016-05-30","182","970","37799939","Samsung SM-J200Y","Samsung SM-J200Y","PREPAY PLUS - $0 -"
"2016-05-30","993","360","14096114","Samsung SM-A300Y","Samsung SM-A300Y","$39.95 Carryover Plan"
"2016-05-30","894","730","9851177","Samsung GT-N7105","Samsung GT-N7105","PREPAY STD - $0 - #2"
"2016-05-30","600","070","18420650","Apple iPhone 5C (A1529)","Apple iPhone 5C (A1529)","PREPAY PLUS - $0 -"
"2016-05-30","234","000","1769661","Galaxy S7 SM-G930F ","Galaxy S7 SM-G930F","$39.95 Plan"

This is the script:

$ cat -v csv_to_json2.py
#!/usr/bin/python

#from here
#https://stackoverflow.com/a/38193687/2392358

import csv
import json
from collections import OrderedDict

dR=csv.DictReader(open("head_data.csv"))
oD=[ OrderedDict(
         sorted(dct.iteritems(),
                key=lambda item:dR.fieldnames.index(item[0])))
     for dct in dR ]

#print to terminal
print json.dumps(oD)

#write to file
#json.dump(oD,"head_op.json")
open('head_op.json', 'wb').write(json.dumps(oD))

Running the script:

$ python csv_to_json2.py
[{"Rec Open Date": "2016-05-30", "MSISDN": "686", "IMEI": "230", "Data Volume (Bytes)": "63979", "Device Manufacturer": "Samsung SM-G935FD ", "Device Model": "Samsung,A, SM-G935FD", "Product Description": "$29.95 Carryover Plan (1GB)"}, {"Rec Open Date": "2016-05-30", "MSISDN": "533", "IMEI": "970", "Data Volume (Bytes)": "171631866", "Device Manufacturer": "Apple iPhone 6 (A1586)", "Device Model": "iPhone 6 (A1586)", "Product Description": "$69.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "191", "IMEI": "610", "Data Volume (Bytes)": "145713", "Device Manufacturer": "Samsung GT-I9195", "Device Model": "Samsung GT-I9195", "Product Description": "$29.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "660", "IMEI": "660", "Data Volume (Bytes)": "2994742", "Device Manufacturer": "Samsung SM-N920I", "Device Model": "Samsung SM-N920I", "Product Description": "GOVERNMENT TIER 2 PLAN"}, {"Rec Open Date": "2016-05-30", "MSISDN": "182", "IMEI": "970", "Data Volume (Bytes)": "37799939", "Device Manufacturer": "Samsung SM-J200Y", "Device Model": "Samsung SM-J200Y", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "993", "IMEI": "360", "Data Volume (Bytes)": "14096114", "Device Manufacturer": "Samsung SM-A300Y", "Device Model": "Samsung SM-A300Y", "Product Description": "$39.95 Carryover Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "894", "IMEI": "730", "Data Volume (Bytes)": "9851177", "Device Manufacturer": "Samsung GT-N7105", "Device Model": "Samsung GT-N7105", "Product Description": "PREPAY STD - $0 - #2"}, {"Rec Open Date": "2016-05-30", "MSISDN": "600", "IMEI": "070", "Data Volume (Bytes)": "18420650", "Device Manufacturer": "Apple iPhone 5C (A1529)", "Device Model": "Apple iPhone 5C (A1529)", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "234", "IMEI": "000", "Data Volume (Bytes)": "1769661", "Device Manufacturer": "Galaxy S7 SM-G930F ", "Device Model": "Galaxy S7 SM-G930F", 
"Product Description": "$39.95 Plan"}]

This is the output:

$ cat -v head_op.json
[{"Rec Open Date": "2016-05-30", "MSISDN": "686", "IMEI": "230", "Data Volume (Bytes)": "63979", "Device Manufacturer": "Samsung SM-G935FD ", "Device Model": "Samsung,A, SM-G935FD", "Product Description": "$29.95 Carryover Plan (1GB)"}, {"Rec Open Date": "2016-05-30", "MSISDN": "533", "IMEI": "970", "Data Volume (Bytes)": "171631866", "Device Manufacturer": "Apple iPhone 6 (A1586)", "Device Model": "iPhone 6 (A1586)", "Product Description": "$69.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "191", "IMEI": "610", "Data Volume (Bytes)": "145713", "Device Manufacturer": "Samsung GT-I9195", "Device Model": "Samsung GT-I9195", "Product Description": "$29.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "660", "IMEI": "660", "Data Volume (Bytes)": "2994742", "Device Manufacturer": "Samsung SM-N920I", "Device Model": "Samsung SM-N920I", "Product Description": "GOVERNMENT TIER 2 PLAN"}, {"Rec Open Date": "2016-05-30", "MSISDN": "182", "IMEI": "970", "Data Volume (Bytes)": "37799939", "Device Manufacturer": "Samsung SM-J200Y", "Device Model": "Samsung SM-J200Y", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "993", "IMEI": "360", "Data Volume (Bytes)": "14096114", "Device Manufacturer": "Samsung SM-A300Y", "Device Model": "Samsung SM-A300Y", "Product Description": "$39.95 Carryover Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "894", "IMEI": "730", "Data Volume (Bytes)": "9851177", "Device Manufacturer": "Samsung GT-N7105", "Device Model": "Samsung GT-N7105", "Product Description": "PREPAY STD - $0 - #2"}, {"Rec Open Date": "2016-05-30", "MSISDN": "600", "IMEI": "070", "Data Volume (Bytes)": "18420650", "Device Manufacturer": "Apple iPhone 5C (A1529)", "Device Model": "Apple iPhone 5C (A1529)", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "234", "IMEI": "000", "Data Volume (Bytes)": "1769661", "Device Manufacturer": "Galaxy S7 SM-G930F ", "Device Model": "Galaxy S7 SM-G930F", 
"Product Description": "$39.95 Plan"}]
HattrickNZ asked Apr 30 '26 18:04


2 Answers

If you don't care about the order of the keys, just do:

import csv
import json
json.dumps(list(csv.DictReader(open("file.csv"))))

Check the pretty-printing section of the json manual for more options, or do

json.dumps(list(csv.DictReader(open("file.csv")))).replace("}, ", "},\n")

to get your expected output.
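
Note that the string replace above would also rewrite any field value that happens to contain "}, ". A minimal sketch that avoids this, assuming Python 3: serialise each row on its own and join the pieces with newlines. The in-memory `sample` below is only an illustrative stand-in for file.csv, using two shortened rows from the question.

```python
import csv
import io
import json

# Illustrative in-memory stand-in for file.csv (shortened rows from the question).
sample = io.StringIO(
    '"Rec Open Date","MSISDN","Product Description"\n'
    '"2016-05-30","686","$29.95 Carryover Plan (1GB)"\n'
    '"2016-05-30","533","$69.95 Plan"\n'
)

rows = list(csv.DictReader(sample))

# Serialise each record separately, then join with ",\n" so every
# object sits on its own line and the whole text is still valid JSON.
text = "[" + ",\n".join(json.dumps(row) for row in rows) + "]"
print(text)
```

Because each record is dumped separately, the line breaks only ever fall between records, never inside a value.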


If you want ordered printing, you may order the keys via OrderedDict:

import csv
import json
from collections import OrderedDict

dR=csv.DictReader(open("/tmp/ah.csv"))
oD=[ OrderedDict(
         sorted(dct.iteritems(),
                key=lambda item:dR.fieldnames.index(item[0])))
     for dct in dR ]
json.dumps(oD)
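
The snippet above is Python 2 (`iteritems`). On Python 3.7 or later, plain dicts keep insertion order and csv.DictReader fills each row in header order, so the sorting step should not be needed there. A minimal sketch under that assumption, with an in-memory `sample` standing in for /tmp/ah.csv:

```python
import csv
import io
import json

# Illustrative in-memory stand-in for /tmp/ah.csv.
sample = io.StringIO(
    '"Rec Open Date","MSISDN","IMEI"\n'
    '"2016-05-30","686","230"\n'
)

reader = csv.DictReader(sample)

# dict(row) keeps the key order of the row, which follows the header line
# (assumes Python 3.7+, where dict order is guaranteed).
records = [dict(row) for row in reader]

ordered_json = json.dumps(records)
print(ordered_json)
```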
xvan answered May 02 '26 07:05


If you want to keep the order of the keys, don't use csv.DictReader, since it overcomplicates things. Just record the header and then zip it with each of the rows:

import csv
from collections import OrderedDict
reader = csv.reader(open("text.csv"))

header = next(reader)

data = [OrderedDict(zip(header,fields)) for fields in reader]

Then you can write it to a file with this:

import json

with open("new.json","w") as f:
    json.dump(data, f)
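
Both steps above leave every value as a string, whereas the output requested in the question shows MSISDN, IMEI and Data Volume (Bytes) as numbers. One possible way to convert them, sketched for Python 3: the helper name `to_record` and the `NUMERIC` set are made up for illustration, and the in-memory `sample` stands in for text.csv. Be aware that int() drops leading zeros, so an IMEI of "070" becomes 70.

```python
import csv
import io
import json

# Hypothetical choice of columns to convert; adjust to your data.
NUMERIC = {"MSISDN", "IMEI", "Data Volume (Bytes)"}

def to_record(header, fields):
    """Pair header names with values, converting the chosen columns to int."""
    return {
        key: int(value) if key in NUMERIC else value
        for key, value in zip(header, fields)
    }

# Illustrative in-memory stand-in for text.csv (shortened rows from the question).
sample = io.StringIO(
    '"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)","Device Model"\n'
    '"2016-05-30","686","230","63979","Samsung,A, SM-G935FD"\n'
    '"2016-05-30","600","070","18420650","Apple iPhone 5C (A1529)"\n'
)

reader = csv.reader(sample)
header = next(reader)
data = [to_record(header, fields) for fields in reader]
print(json.dumps(data))
```

If the leading zeros matter (they usually do for identifiers such as IMEI), leave those columns out of `NUMERIC` and keep them as strings.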
Tadhg McDonald-Jensen answered May 02 '26 08:05


