I was wondering if there is an elegant, shorthand way in pandas to select DataFrame columns by data type (dtype), e.g. select only the int64 columns from a DataFrame.
To elaborate, something along the lines of
df.select_columns(dtype=float64)
Thanks in advance for the help
Selecting columns based on their name
This is the most basic way to select a single column from a DataFrame: put the column's string name in brackets, and you get back a pandas Series. Passing a list of names in the brackets selects multiple columns at the same time and returns a DataFrame.
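For reference, selection by name could look like the sketch below. The frame and its column names are made up purely for illustration.

```python
import pandas as pd

# Hypothetical sample frame, only for illustration.
df = pd.DataFrame({"A": [1, 2], "B": [2.2, 3.3], "C": ["x", "y"]})

single = df["A"]          # one name in brackets gives a Series
several = df[["A", "B"]]  # a list of names gives a DataFrame

print(type(single).__name__)   # Series
print(type(several).__name__)  # DataFrame
```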
Since pandas 0.14.1 there's a select_dtypes method, so you can do this more elegantly/generally.
In [11]: df = pd.DataFrame([[1, 2.2, 'three']], columns=['A', 'B', 'C'])

In [12]: df.select_dtypes(include=['int'])
Out[12]:
   A
0  1
To select all numeric types, use the NumPy dtype numpy.number:
In [13]: df.select_dtypes(include=[np.number])
Out[13]:
   A    B
0  1  2.2

In [14]: df.select_dtypes(exclude=[object])
Out[14]:
   A    B
0  1  2.2
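If you would rather run this as a plain script than in an IPython session, a minimal sketch might look like the following. It reuses the one-row frame from the example; forcing column A to int64 is my own assumption, added so the result does not depend on the platform's default integer size.

```python
import numpy as np
import pandas as pd

# Same one-row data as in the session above. Column A is explicitly int64
# (an assumption added here for platform-independent behaviour).
df = pd.DataFrame({"A": np.array([1], dtype="int64"), "B": [2.2], "C": ["three"]})

int_cols = df.select_dtypes(include=["int64"])     # only int64 columns
num_cols = df.select_dtypes(include=[np.number])   # every numeric column
non_float = df.select_dtypes(exclude=["float64"])  # everything except float64

print(list(int_cols.columns))   # ['A']
print(list(num_cols.columns))   # ['A', 'B']
print(list(non_float.columns))  # ['A', 'C']
```

select_dtypes returns a new DataFrame each time, so df itself is left unchanged.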
df.loc[:, df.dtypes == np.float64]
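This one-liner works because df.dtypes is a Series indexed by column name, so comparing it to a dtype gives a boolean mask that .loc can apply along the column axis. A small sketch with made-up data (the frame and variable names are only for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame, only for illustration.
df = pd.DataFrame({"A": np.array([1, 2], dtype="int64"), "B": [2.2, 3.3], "C": ["x", "y"]})

mask = df.dtypes == np.float64  # boolean Series, one entry per column
float_df = df.loc[:, mask]      # keep all rows, only the masked columns

print(list(float_df.columns))  # ['B']
```

Unlike select_dtypes, the mask matches one exact dtype, so float32 columns would not be picked up by a comparison against np.float64.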