
Save a pandas DataFrame while preserving NA values

I have this code:

import pandas as pd
import numpy as np
import csv
df = pd.DataFrame({'animal': 'cat dog cat fish dog cat cat'.split(),
               'size': list('SSMMMLL'),
               'weight': [8, 10, 11, 1, 20, 12, 12],
               'adult' : [False] * 5 + [True] * 2})

Then I replaced the whole weight column with NaN values:

df['weight'] = np.nan

Finally, I saved it:

df.to_csv("ejemplo.csv", sep=";", decimal=",", quoting=csv.QUOTE_NONNUMERIC, index=False)

But when I open the file, the missing values are written as empty fields (""). I want them written as NA instead.

I want as output:

adult;animal;size;weight
False;"dog";"S";NA
False;"cat";"M";NA    
Náthali asked Apr 04 '16 14:04



1 Answer

If you want NaN values written as a particular string, pass na_rep to to_csv:

In [8]:
df.to_csv(na_rep='NA')

Out[8]:
',adult,animal,size,weight\n0,False,cat,S,NA\n1,False,dog,S,NA\n2,False,cat,M,NA\n3,False,fish,M,NA\n4,False,dog,M,NA\n5,True,cat,L,NA\n6,True,cat,L,NA\n'
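As a quick check of what na_rep changes, here is a minimal sketch on a two-row frame (the smaller frame and the variable names are my own, not from the question). It compares the default output with the na_rep='NA' output and reads the result back. As far as I know, 'NA' is one of the strings read_csv treats as missing by default, so the value should come back as NaN.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical two-row frame, smaller than the one in the question.
df = pd.DataFrame({"animal": ["cat", "dog"], "weight": [np.nan, 10.0]})

# Default: a missing value is written as an empty field.
default_csv = df.to_csv(index=False)

# With na_rep, the missing value is written as the given string.
na_csv = df.to_csv(index=False, na_rep="NA")

print(default_csv.splitlines())  # ['animal,weight', 'cat,', 'dog,10.0']
print(na_csv.splitlines())       # ['animal,weight', 'cat,NA', 'dog,10.0']

# Reading the text back: 'NA' is parsed as a missing value again.
round_trip = pd.read_csv(io.StringIO(na_csv))
print(round_trip["weight"].isna().tolist())  # [True, False]
```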

If you want the NA wrapped in quotes, escape the quotes inside the string:

In [3]:
df = pd.DataFrame({'animal': 'cat dog cat fish dog cat cat'.split(),
               'size': list('SSMMMLL'),
               'weight': [8, 10, 11, 1, 20, 12, 12],
               'adult' : [False] * 5 + [True] * 2})
df['weight'] = np.nan
df.to_csv(na_rep='\'NA\'')

Out[3]:
",adult,animal,size,weight\n0,False,cat,S,'NA'\n1,False,dog,S,'NA'\n2,False,cat,M,'NA'\n3,False,fish,M,'NA'\n4,False,dog,M,'NA'\n5,True,cat,L,'NA'\n6,True,cat,L,'NA'\n"
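One thing to keep in mind with the quoted variant: as far as I can tell, read_csv only treats the double quote as a quote character by default, so a literal 'NA' (with single quotes) is read back as ordinary text rather than as a missing value. A small sketch, assuming a made-up two-row frame, that shows this and one possible workaround via na_values:

```python
import io

import numpy as np
import pandas as pd

# Hypothetical frame with an all-missing weight column.
df = pd.DataFrame({"animal": ["cat", "dog"], "weight": [np.nan, np.nan]})

# Write missing values as the literal text 'NA', single quotes included.
quoted_csv = df.to_csv(index=False, na_rep="'NA'")

# Read back with defaults: the single-quoted text stays a plain string.
as_text = pd.read_csv(io.StringIO(quoted_csv))

# Tell read_csv explicitly that this string marks a missing value.
as_missing = pd.read_csv(io.StringIO(quoted_csv), na_values=["'NA'"])

print(as_text["weight"].tolist())
print(as_missing["weight"].isna().all())
```

So if the file is meant to be read back with pandas, the unquoted NA is probably the more convenient choice.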

EDIT

To get the desired output, use these parameters:

In [27]:
df.to_csv(na_rep='NA', sep=';', index=False, quoting=csv.QUOTE_NONE)  # csv.QUOTE_NONE == 3, needs import csv

Out[27]:
'adult;animal;size;weight\nFalse;cat;S;NA\nFalse;dog;S;NA\nFalse;cat;M;NA\nFalse;fish;M;NA\nFalse;dog;M;NA\nTrue;cat;L;NA\nTrue;cat;L;NA\n'
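For completeness, here is a rough sketch of the same call writing to an actual file, combined with the sep=";" and decimal="," settings from the question. The two-row frame and the temporary directory are assumptions of mine for illustration; the file name is the one from the question. It also reads the file back with matching sep and decimal to check that NA turns into NaN again.

```python
import csv
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Hypothetical frame: one real weight and one missing weight.
df = pd.DataFrame({"animal": ["cat", "dog"], "weight": [8.5, np.nan]})

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "ejemplo.csv"

    # Semicolon-separated, comma as decimal mark, NaN written as NA, no quoting.
    df.to_csv(path, sep=";", decimal=",", na_rep="NA",
              index=False, quoting=csv.QUOTE_NONE)

    # Raw file content, one list entry per line.
    raw_lines = path.read_text().splitlines()

    # Read it back with the same separator and decimal mark.
    restored = pd.read_csv(path, sep=";", decimal=",")

print(raw_lines)
print(restored)
```

Note that quoting=csv.QUOTE_NONE leaves the strings unquoted, so this matches the output shown above rather than the quoted strings in the question's example.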
EdChum answered Oct 17 '22 20:10