 

Drop non-numeric columns from a pandas DataFrame [duplicate]

Tags:

python

pandas

In my application I load text files that are structured as follows:

  • A first non-numeric column (ID)
  • A number of non-numeric columns (strings)
  • A number of numeric columns (floats)

The number of non-numeric columns varies from file to file. Currently I load the data into a DataFrame like this:

source = pandas.read_table(inputfile, index_col=0) 

I would like to drop all non-numeric columns in one fell swoop, without knowing their names or indices; this should be doable by reading each column's dtype. Is this possible with pandas, or do I have to cook up something on my own?

Einar asked Oct 04 '12 10:10




2 Answers

To avoid using a private method, you can also use select_dtypes, which lets you either include or exclude the dtypes you want.

I ran into it in another post about exactly the same problem.

Or in your case, specifically:
source.select_dtypes(['number']) or source.select_dtypes([np.number])
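
As a minimal sketch of the include/exclude forms: the frame below is made-up data shaped like the question's files (the column names `group`, `label`, `x`, `y` and the ID values are assumptions, not taken from the original input).

```python
import numpy as np
import pandas as pd

# Hypothetical data: an ID index, two string columns, two float columns.
source = pd.DataFrame(
    {
        "group": ["a", "b", "c"],
        "label": ["foo", "bar", "baz"],
        "x": [1.0, 2.5, 3.0],
        "y": [0.1, 0.2, 0.3],
    },
    index=pd.Index(["id1", "id2", "id3"], name="ID"),
)

# Keep only the numeric columns, without naming them.
numeric = source.select_dtypes(include=[np.number])

# The complementary selection: everything that is not numeric.
non_numeric = source.select_dtypes(exclude=[np.number])

print(list(numeric.columns))
print(list(non_numeric.columns))
```

Both calls return a new DataFrame and leave `source` untouched, so the result has to be assigned if you want to keep it.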

sapo_cosmico answered Sep 25 '22 08:09



It's a private method, but it will do the trick: source._get_numeric_data()

In [2]: import pandas as pd

In [3]: source = pd.DataFrame({'A': ['foo', 'bar'], 'B': [1, 2], 'C': [(1,2), (3,4)]})

In [4]: source
Out[4]:
     A  B       C
0  foo  1  (1, 2)
1  bar  2  (3, 4)

In [5]: source._get_numeric_data()
Out[5]:
   B
0  1
1  2
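
For comparison, here is a sketch of the same example frame handled with public API only. The helper name `numeric_columns` is hypothetical; it is one way to "cook up something on your own" by reading each column's dtype, not something pandas ships.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

source = pd.DataFrame({'A': ['foo', 'bar'], 'B': [1, 2], 'C': [(1, 2), (3, 4)]})

# Public-API route: select columns by dtype.
by_select = source.select_dtypes(include='number')


def numeric_columns(df):
    """Hypothetical helper: names of the columns whose dtype is numeric."""
    return [name for name in df.columns if is_numeric_dtype(df[name])]


# Hand-rolled route: read each column's dtype and keep the numeric ones.
by_hand = source[numeric_columns(source)]

print(list(by_select.columns))
print(list(by_hand.columns))
```

On this frame both routes keep only column B, since the string column A and the tuple column C are not numeric.
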
Wouter Overmeire answered Sep 25 '22 08:09
