I have a pandas DataFrame that I want to save in JSON format. The pandas docs say:
Note NaN's, NaT's and None will be converted to null and datetime objects will be converted based on the date_format and date_unit parameters.
With the orient option set to records, I get something like this:
[{"A":1,"B":4,"C":7},{"A":null,"B":5,"C":null},{"A":3,"B":null,"C":null}]
Is it possible to have this instead:
[{"A":1,"B":4,"C":7},{"B":5},{"A":3}]
Thank you
isnull is an alias for DataFrame.isna. Detect missing values. Return a boolean same-sized object indicating if the values are NA. NA values, such as None or numpy.
Python | Pandas isnull() and notnull(): When a DataFrame is built from a CSV file, many blank columns are imported as null values, which later causes problems when operating on that DataFrame. The pandas isnull() and notnull() methods are used to check for and manage null values in a DataFrame.
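As a rough illustration of isnull() and notnull(), here is a minimal sketch. The sample DataFrame is an assumption made up for this example and is not taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data containing a few missing values
df = pd.DataFrame({"A": [1, np.nan, 3], "B": [4, 5, np.nan]})

# Boolean frame of the same shape: True where a value is missing
mask = df.isnull()

# The inverse: True where a value is present
present = df.notnull()

# Summing the boolean mask counts the missing values per column
missing_per_column = mask.sum()
print(missing_per_column)
```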
The solution above doesn't actually produce results in the 'records' format. This solution also uses the json package, but produces exactly the result asked for in the original question.
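import json
import pandas as pd

# df is the DataFrame from the question; drop the NaN entries of each row before converting it to a dict
json.dumps([row.dropna().to_dict() for index, row in df.iterrows()])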
Additionally, if you want to include the index (and you are on Python 3.5+) you can do:
json.dumps([{'index':index, **row.dropna().to_dict()} for index,row in df.iterrows()])
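For anyone who wants to try the row-wise approach end to end, here is a self-contained sketch. The sample DataFrame is an assumption reconstructed from the output shown in the question, and records_without_nulls is a hypothetical helper name used only for illustration.

```python
import json

import numpy as np
import pandas as pd

# Assumed sample data, chosen to match the JSON shown in the question
df = pd.DataFrame({
    "A": [1, np.nan, 3],
    "B": [4, 5, np.nan],
    "C": [7, np.nan, np.nan],
})


def records_without_nulls(frame):
    """Return one dict per row, leaving out the keys whose value is missing."""
    return [row.dropna().to_dict() for _, row in frame.iterrows()]


records = records_without_nulls(df)
result = json.dumps(records)
print(result)
```

Because the columns contain NaN they are stored as floats, so the numbers are serialized as 1.0 rather than 1.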
The following gets close to what you want. Essentially, we create a list of the non-NaN values and then call to_json on it:
In [136]:
df.apply(lambda x: [x.dropna()], axis=1).to_json()
Out[136]:
'{"0":[{"a":1.0,"b":4.0,"c":7.0}],"1":[{"b":5.0}],"2":[{"a":3.0}]}'
Creating a list is necessary here; otherwise pandas will try to align the result with the shape of your original df, which reintroduces the NaN values you want to avoid:
In [138]:
df.apply(lambda x: pd.Series(x.dropna()), axis=1).to_json()
Out[138]:
'{"a":{"0":1.0,"1":null,"2":3.0},"b":{"0":4.0,"1":5.0,"2":null},"c":{"0":7.0,"1":null,"2":null}}'
Calling list on the result of dropna also broadcasts the result to the original shape, as if filling:
In [137]:
df.apply(lambda x: list(x.dropna()), axis=1).to_json()
Out[137]:
'{"a":{"0":1.0,"1":5.0,"2":3.0},"b":{"0":4.0,"1":5.0,"2":3.0},"c":{"0":7.0,"1":5.0,"2":3.0}}'
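Another option, offered here only as a sketch and not taken from the answers, is to let to_json do the serialization (so the date_format and date_unit handling stays in effect) and then strip the null entries from the parsed result. The sample DataFrame is again an assumption based on the question.

```python
import json

import numpy as np
import pandas as pd

# Assumed sample data, chosen to match the JSON shown in the question
df = pd.DataFrame({
    "A": [1, np.nan, 3],
    "B": [4, 5, np.nan],
    "C": [7, np.nan, np.nan],
})

# Serialize with pandas first, then parse the JSON back into Python objects
records = json.loads(df.to_json(orient="records"))

# Drop every key whose value was serialized as null (None after parsing)
cleaned = [
    {key: value for key, value in record.items() if value is not None}
    for record in records
]

result = json.dumps(cleaned)
print(result)
```

This round-trips through a JSON string, so it does extra work on large frames, but it avoids iterating over the rows with iterrows.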