I need to convert a pandas dataframe to a JSON object.
However,
json.dumps(df.to_dict(orient='records'))
fails because the boolean columns are not JSON serializable: their values are of type numpy.bool_. I've tried df['boolCol'] = df['boolCol'].astype(bool), but that still leaves the values as numpy.bool_ rather than the Python bool, which serializes to JSON without a problem.
Any suggestions on how to convert the columns without looping through every record and converting it?
Thanks
EDIT:
This is part of a whole sanitization of dataframes of varying content so they can be used as the JSON payload for an API. Hence we currently have something like this:
for cols in df.columns:
    if type(df[cols][0]) == pd._libs.tslibs.timestamps.Timestamp:
        df[cols] = df[cols].astype(str)
    elif type(df[cols][0]) == numpy.bool_:
        df[cols] = df[cols].astype(bool)  # still numpy bool afterwards!
Just tested it out, and the problem seems to be caused by the orient='records' parameter. It seems you have to set it to another option (e.g. 'list') and then convert the result to your preferred format.
import json
import numpy as np
import pandas as pd
column_name = 'bool_col'
bool_df = pd.DataFrame(np.array([True, False, True]), columns=[column_name])
# orient='list' returns plain Python bools in each column list
list_repres = bool_df.to_dict('list')
# rebuild the records layout by hand
record_repres = [{column_name: values} for values in list_repres[column_name]]
json.dumps(record_repres)
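If the end goal is just a JSON payload, another option that may be worth trying is to let pandas do the serialization with DataFrame.to_json, which handles NumPy scalar types itself. The sketch below is only an illustration: the frame and the column names bool_col and ts_col are made up, and date_format="iso" is assumed to be an acceptable timestamp format for the API in question.

```python
import json

import numpy as np
import pandas as pd

# Hypothetical frame mixing a boolean column and a timestamp column.
df = pd.DataFrame(
    {
        "bool_col": np.array([True, False, True]),
        "ts_col": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
    }
)

# to_json serializes NumPy scalars itself; date_format="iso" writes
# timestamps as ISO 8601 strings instead of epoch milliseconds.
payload = df.to_json(orient="records", date_format="iso")

# Parse back into plain Python objects if a list of dicts is needed.
records = json.loads(payload)
```

Here payload is already a JSON string, so no per-column type checks are needed; records is only required if the payload has to be modified as Python objects before sending.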