
keep variable type in json.dump

Tags: python, json, csv

I am reading a bunch of CSV files whose lines contain text and numbers, as follows:

"T55BSU","@","IT-196","IT","NO","@",1,385.82,1.4011825391667,"LFA","Economy", ...
"343OA3","A:1893BC6","FR-7139","FR","NO","@",1,805.01,1.4011825391667,"LFA","Economy", ...
...

I have a small Python script which loops over the files and dumps their content into JSON format:

#!/usr/bin/python

import csv
import json

from os import listdir
from os.path import isfile, join

csvpath = "path to csv dir"
jsonpath = "path to json dir"

onlyfiles = [f for f in listdir(csvpath) if isfile(join(csvpath, f))]

fieldnames = ("names of columns")

for filename in onlyfiles:
    name = filename.split('.')
    # join() inserts the path separator that plain concatenation would miss
    csvname = join(csvpath, filename)
    jsoname = join(jsonpath, name[0] + '.json')

    print("Opening " + csvname + "\n")
    print("Writing " + jsoname + "\n")

    # the with block closes both files once the rows have been written
    with open(csvname, 'r') as csvfile, open(jsoname, 'w') as jsonfile:
        reader = csv.DictReader(csvfile, fieldnames)

        for row in reader:
            json.dump(row, jsonfile)
            jsonfile.write('\n')

My problem is that all the values in the JSON file end up as strings, like this:

{"REFUND_SW": "N", "DEST_COUNTRY": "IT", "LOWCOST_CAR": "NO", "CURRATE": "1.4011825391667", "DEFAULT_CLIENT_GROUP_CD": "IT-196", "MAIN_SUPPLIER_CODE": "BV", "DEST_CITY": "ROME", "TRAVEL_PURPOSE": "C", "FARE_TYPE": "C", "CONNECTION_TIME": "0", "BOOKING_DATE": "2014-04-14", "FLIGHT_DURATION": "70"}

But I would like:

{"REFUND_SW": "N", "DEST_COUNTRY": "IT", "LOWCOST_CAR": "NO", "CURRATE": 1.4011825391667, "DEFAULT_CLIENT_GROUP_CD": "IT-196", "MAIN_SUPPLIER_CODE": "BV", "DEST_CITY": "ROME", "TRAVEL_PURPOSE": "C", "FARE_TYPE": "C", "CONNECTION_TIME": 0, "BOOKING_DATE": "2014-04-14", "FLIGHT_DURATION": 70}

How do I force json.dump to not convert everything to strings? In the original csv file, they are written as numbers ...

Thanks

Carlos asked May 29 '26 06:05


1 Answer

The problem is not json.dump, but the csv reader. It returns every value as a string, whether or not the field was quoted in the file (see the related question "Read data from csv-file and transform to correct data-type").
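
To see this in isolation, here is a minimal sketch that parses one of the sample lines from the question. The parse_line helper is a hypothetical name introduced only for illustration. The second call uses csv.QUOTE_NONNUMERIC, a standard-library option that tells the reader to convert every unquoted field to float. Whether that suits the data is an assumption to check: integers such as 1 come back as 1.0, and an empty unquoted field would raise an error.

```python
import csv

# One sample line from the question: quoted text fields, unquoted numbers.
line = '"T55BSU","@","IT-196","IT","NO","@",1,385.82,1.4011825391667,"LFA","Economy"'


def parse_line(text, **reader_options):
    """Hypothetical helper: parse a single CSV line and return its fields."""
    return next(csv.reader([text], **reader_options))


# Default behaviour: every field is a str, quoted or not.
default_fields = parse_line(line)
print(default_fields[6:9])   # ['1', '385.82', '1.4011825391667']

# QUOTE_NONNUMERIC: the reader converts unquoted fields to float.
numeric_fields = parse_line(line, quoting=csv.QUOTE_NONNUMERIC)
print(numeric_fields[6:9])   # [1.0, 385.82, 1.4011825391667]
```

If the integer columns must stay integers, converting per column after reading gives more control than this option.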

If you know the data type of the columns you can convert them after reading:

#!/usr/bin/python

import csv
import json

# DictReader accepts any iterable of lines, so a list stands in for a file here
csvfile = [
    '"name","age","grade"',
    '"ann",42,1.3',
    '"hans",23,1.7'
]
# map each column name to the callable that converts its string value
row_types = {'name': str, 'grade': float, 'age': int}

reader = csv.DictReader(csvfile)

# the with block makes sure test.json is flushed and closed
with open('test.json', 'w') as jsonfile:
    for row in reader:
        print('reader produces strings only:')
        print(row)
        print('convert to known types')
        row_converted = {k: row_types[k](v) for k, v in row.items()}
        print(row_converted)
        json.dump(row_converted, jsonfile)
        jsonfile.write('\n')
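
If the column types are not known in advance, one possible fallback is to guess them per value. The sketch below is an assumption-laden variant of the code above, not part of the original answer: guess_type is a hypothetical helper that tries int, then float, and otherwise keeps the string. Guessing ignores whether a field was quoted, so a quoted identifier like "0123" would become the integer 123; an explicit type map is safer for such columns.

```python
import csv
import json


def guess_type(value):
    """Hypothetical helper: try int, then float, else return the value unchanged."""
    for convert in (int, float):
        try:
            return convert(value)
        except (ValueError, TypeError):
            continue
    return value


# Same sample rows as in the answer above.
lines = [
    '"name","age","grade"',
    '"ann",42,1.3',
    '"hans",23,1.7',
]

converted_rows = [
    {key: guess_type(value) for key, value in row.items()}
    for row in csv.DictReader(lines)
]

for row in converted_rows:
    print(json.dumps(row))
```

Values that do not parse as numbers, such as the question's "2014-04-14" booking date, pass through as strings.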

jandob answered May 31 '26 19:05


