Pandas Dataframe object types fillna exception over different datatypes

Tags: python, pandas

I have a pandas DataFrame with different dtypes for the different columns. For example, df.dtypes returns the following:

Date                    datetime64[ns]
FundID                           int64
FundName                        object
CumPos                           int64
MTMPrice                       float64
PricingMechanism                object

Several of these columns have missing values in them. Doing group operations on them with NaN values in place causes problems. Getting rid of them with the .fillna() method is the obvious choice. The problem is that the obvious choice for strings is .fillna("") while .fillna(0) is the correct choice for ints and floats. Using either method on the whole DataFrame throws an exception. Are there any elegant solutions besides filling the columns individually (I have about 30 columns)? I have a lot of code depending on the DataFrame and would prefer not to retype the columns, as that is likely to break some other logic. I can do:

df.FundID.fillna(0)
df.FundName.fillna("")
etc
Joop asked Jun 18 '13 15:06


1 Answer

Similar to @Guddi's answer: a bit verbose, but still more concise than @Ryan's answer, and it keeps all columns:

df[df.select_dtypes("object").columns] = df.select_dtypes("object").fillna("")
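
The one-liner above only covers the object (string) columns. If the numeric columns need filling too, one possible approach is to build a per-column mapping from the dtypes and pass it to fillna() in a single call. This is a minimal sketch, not part of the original answer: the sample frame and the fill_values name are made up for illustration, with column names borrowed from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names borrowed from the question.
# dtype=object is set explicitly so the string columns are object dtype
# regardless of the pandas version's default string handling.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2013-06-18", None, "2013-06-20"]),
    "FundName": pd.Series(["A", None, "C"], dtype=object),
    "MTMPrice": [1.5, np.nan, 3.0],
    "PricingMechanism": pd.Series([None, "x", "y"], dtype=object),
})

# Build a {column: fill value} mapping from each column's dtype:
# "" for object columns, 0 for numeric columns.
fill_values = {}
for col in df.select_dtypes(include="object").columns:
    fill_values[col] = ""
for col in df.select_dtypes(include="number").columns:
    fill_values[col] = 0

# fillna() accepts a dict keyed by column name and returns a new frame.
filled = df.fillna(value=fill_values)
print(filled)
```

Columns with other dtypes (such as the datetime Date column here) are not in the mapping, so their missing values are left as they are. fillna() returns a new frame in this form, so assign the result back to df if you want to keep working with the same name.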
nik answered Oct 21 '22 17:10