
pandas.to_dict returns None mixed with nan

I've stumbled upon a minor problem with pandas and its to_dict method. I have a table in which I'm certain every row has the same set of columns; let's say it looks like this:

+----|----|----+
|COL1|COL2|COL3|
+----|----|----+
|VAL1|    |VAL3|
|    |VAL2|VAL3|
|VAL1|VAL2|    |
+----|----|----+

When I do df.to_dict(orient='records') I get:

[{
     "COL1":"VAL1"
     ,"COL2":nan
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":None
     ,"COL2":"VAL2"
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":"VAL1"
     ,"COL2":"VAL2"
     ,"COL3":nan
}]

Notice the nan's in some columns and the None's in others (this is consistent: nan and None never appear in the same column).

And when I do json.loads(df.to_json(orient='records')) I get only None and no nan's (which is the desired output).

Like this:

[{
     "COL1":"VAL1"
     ,"COL2":None
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":None
     ,"COL2":"VAL2"
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":"VAL1"
     ,"COL2":"VAL2"
     ,"COL3":None
}]

I would appreciate an explanation of why this happens and whether it can be controlled in some way.

==EDIT==

According to the comments, it would be better to first replace those nan's with None's, but those nan's are not np.nan:

>>> a = df.head().iloc[0, 60]  # .ix was removed in pandas 1.0
>>> a
nan
>>> type(a)
<class 'numpy.float64'>
>>> a is np.nan
False
>>> a == np.nan
False
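
Both checks above return False for any NaN, so they cannot tell whether a value is missing: NaN compares unequal to everything, including itself, and a numpy.float64 NaN read from a DataFrame is a different object from np.nan. A minimal sketch of checks that do work, where `a` is built by hand as a stand-in for the DataFrame cell (an assumption, since the original data is not shown):

```python
import math

import numpy as np
import pandas as pd

# Hand-built stand-in for a missing value read from a DataFrame cell.
a = np.float64("nan")

identity_check = a is np.nan   # False: a is a separate numpy.float64 object
equality_check = a == np.nan   # False: NaN never compares equal to anything
isna_check = pd.isna(a)        # True: pandas' missing-value test
isnan_check = math.isnan(a)    # True: works for any float-like NaN
none_check = pd.isna(None)     # True: pandas also treats None as missing

print(identity_check, equality_check, isna_check, isnan_check, none_check)
```

Because pd.isna accepts both NaN and None, it is a convenient single test when a column may contain either.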
Piotr Kamoda asked Apr 06 '17 12:04




1 Answer

I think you can only replace the values beforehand; it is not possible to control this in to_dict itself:

import numpy as np
import pandas as pd

L = [{
     "COL1":"VAL1"
     ,"COL2":np.nan
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":None
     ,"COL2":"VAL2"
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":"VAL1"
     ,"COL2":"VAL2"
     ,"COL3":np.nan
}]

df = pd.DataFrame(L).replace({np.nan:None})
print (df)
   COL1  COL2  COL3
0  VAL1  None  VAL3
1  None  VAL2  VAL3
2  VAL1  VAL2  None

print (df.to_dict(orient='records'))
[{'COL3': 'VAL3', 'COL2': None, 'COL1': 'VAL1'}, 
 {'COL3': 'VAL3', 'COL2': 'VAL2', 'COL1': None}, 
 {'COL3': None, 'COL2': 'VAL2', 'COL1': 'VAL1'}]
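
As for why the two methods differ: to_dict hands back the cell values as they are stored, so a column can yield either nan or None, whereas JSON has no NaN, so to_json writes null for both. If replacing values in the DataFrame is not desirable, another option is to normalize the records after calling to_dict. A minimal sketch, assuming every cell holds a scalar (pd.isna returns an array for list-like values); the helper name `clean_records` is hypothetical and not part of pandas:

```python
import numpy as np
import pandas as pd


def clean_records(df):
    """Return df as a list of row dicts with every missing value set to None."""
    return [
        # pd.isna is True for both NaN and None, so both map to None.
        {key: (None if pd.isna(value) else value) for key, value in row.items()}
        for row in df.to_dict(orient="records")
    ]


# Same shape as the table in the question, mixing None and NaN on purpose.
df = pd.DataFrame({
    "COL1": ["VAL1", None, "VAL1"],
    "COL2": [np.nan, "VAL2", "VAL2"],
    "COL3": ["VAL3", "VAL3", np.nan],
})

records = clean_records(df)
print(records)
```

This leaves the DataFrame and its dtypes unchanged and only touches the exported dictionaries.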
jezrael answered Oct 30 '22 21:10
