With DataFrame.to_csv, I managed to write CSV files without NaN values using:
df = df.replace('None','')
df = df.replace('nan','')
My issue with this approach is that every NaN value gets replaced with quotes: ''
Is it possible to replace NaN values depending on the column type?
if the NaN is in an int column, don't add quotes
if str, set to ''
if float, set to 0.0
etc.
I tried this code, but it failed:
df['myStringColumn'].replace('None', '')
Edit: here is the sample data frame I have:
  aTest    Vendor    name  price  qty
0     y   NewVend          21.20  nan
1     y  OldMakes          11.20    3
2   nan       nan  sample   9.20    1
3     n       nan    make    nan    0
Here is my goal:
'y','NewVend','',21.20,,
'y','OldMakes','',11.20,3,
'','','sample',9.20,1,
'n','','make',0.0,0,
Here is the full script:
import csv
import os
import sys
import pandas as pd
dtype_dic = {'price': float, 'qty': float}
df = pd.read_excel(os.path.join(sys.path[0], d.get('csv')), dtype=str)
for col, col_type in dtype_dic.items():
    df[col] = df[col].astype(col_type)
df = df.replace('None','')
df = df.replace('nan','')
df.to_csv('test.csv', index=False, header=False, quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
You can select the columns of each required type with select_dtypes and then call fillna on them. This works when the missing values are np.nan, and it works for None as well.
float_cols = df.select_dtypes(include=['float64']).columns
str_cols = df.select_dtypes(include=['object']).columns
df.loc[:, float_cols] = df.loc[:, float_cols].fillna(0)
df.loc[:, str_cols] = df.loc[:, str_cols].fillna('')
You get
  aTest    Vendor    name  price  qty
0     y   NewVend           21.2  0.0
1     y  OldMakes           11.2  3.0
2                  sample    9.2  1.0
3     n              make    0.0  0.0
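If you want different columns to get different fill values (for example, leaving qty untouched while price becomes 0.0), one option is to pass a dict to fillna. The following is only a sketch: the sample frame is rebuilt by hand from the question's table, fill_values and buffer are names made up for this example, and the output is written to an in-memory buffer instead of test.csv.

```python
import csv
import io

import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hand-built stand-in for the question's sample data (an assumption).
df = pd.DataFrame({
    "aTest": ["y", "y", np.nan, "n"],
    "Vendor": ["NewVend", "OldMakes", np.nan, np.nan],
    "name": [np.nan, np.nan, "sample", "make"],
    "price": [21.20, 11.20, 9.20, np.nan],
    "qty": [np.nan, 3, 1, 0],
})

# Text columns get '', price gets 0.0.
# qty is left out on purpose, so its NaN stays a missing value.
fill_values = {col: "" for col in df.columns if not is_numeric_dtype(df[col])}
fill_values["price"] = 0.0
df = df.fillna(value=fill_values)

# Write to an in-memory buffer so the result can be inspected directly.
buffer = io.StringIO()
df.to_csv(buffer, index=False, header=False,
          quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
csv_text = buffer.getvalue()
print(csv_text)
```

How a remaining NaN is written under QUOTE_NONNUMERIC is worth checking in the printed output for your pandas version before relying on it.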
Try this!!
data = pd.read_csv("dataset/pokemon.csv")
data.head(7)
   #           Name  Type 1  Type 2  HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0  1      Bulbasaur   Grass  Poison  45      49       49       65       65     45           1      False
1  2        Ivysaur   Grass  Poison  60      62       63       80       80     60           1      False
2  3       Venusaur   Grass  Poison  80      82       83      100      100     80           1      False
3  4  Mega Venusaur   Grass  Poison  80     100      123      122      120     80           1      False
4  5     Charmander    Fire     NaN  39      52       43       60       50     65           1      False
5  6     Charmeleon    Fire     NaN  58      64       58       80       65     80           1      False
6  7      Charizard    Fire  Flying  78      84       78      109       85    100           1      False
To find the string-type columns:
types = list(data.iloc[0])
str_types = []
for i, a in enumerate(types):
    if isinstance(a, str):
        str_types.append(i)
print(str_types)
[1, 2, 3]  # columns with string values
Replace NaN with " ":
for a in str_types:
    data.iloc[:, a] = data.iloc[:, a].fillna(" ")
data
     #              Name  Type 1  Type 2  HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
 0   1         Bulbasaur   Grass  Poison  45      49       49       65       65     45           1      False
 1   2           Ivysaur   Grass  Poison  60      62       63       80       80     60           1      False
 2   3          Venusaur   Grass  Poison  80      82       83      100      100     80           1      False
 3   4     Mega Venusaur   Grass  Poison  80     100      123      122      120     80           1      False
 4   5        Charmander    Fire          39      52       43       60       50     65           1      False
 5   6        Charmeleon    Fire          58      64       58       80       65     80           1      False
 6   7         Charizard    Fire  Flying  78      84       78      109       85    100           1      False
 7   8  Mega Charizard X    Fire  Dragon  78     130      111      130       85    100           1      False
 8   9  Mega Charizard Y    Fire  Flying  78     104       78      159      115    100           1      False
 9  10          Squirtle   Water          44      48       65       50       64     43           1      False
10  11         Wartortle   Water          59      63       80       65       80     58           1      False
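One caveat with inspecting only the first row: if that row happens to hold NaN in a text column, the column is missed. A possible alternative is to pick the columns by dtype instead. This is a sketch on a small hand-typed subset of the table above (not the real pokemon.csv), and text_cols is a name invented for the example.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Small hand-typed subset of the table above (an assumption, not the real file).
data = pd.DataFrame({
    "Name": ["Bulbasaur", "Charmander", "Squirtle"],
    "Type 1": ["Grass", "Fire", "Water"],
    "Type 2": ["Poison", np.nan, np.nan],
    "HP": [45, 39, 44],
    "Legendary": [False, False, False],
})

# Treat every non-numeric column as text; bool columns count as numeric here.
text_cols = [col for col in data.columns if not is_numeric_dtype(data[col])]

# Assign the result back instead of relying on inplace=True.
data[text_cols] = data[text_cols].fillna(" ")
print(text_cols)
print(data)
```

Assigning the filled columns back avoids depending on whether an in-place fill on a selection propagates to the original frame.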