Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Extracting specific selected columns to new DataFrame as a copy

I have a pandas DataFrame with 4 columns and I want to create a new DataFrame that only has three of the columns. This question is similar to: Extracting specific columns from a data frame but for pandas not R. The following code does not work, raises an error, and is certainly not the pandasnic way to do it.

import pandas as pd old = pd.DataFrame({'A' : [4,5], 'B' : [10,20], 'C' : [100,50], 'D' : [-30,-50]}) new = pd.DataFrame(zip(old.A, old.C, old.D)) # raises TypeError: data argument can't be an iterator  

What is the pandasnic way to do it?

like image 646
SpeedCoder5 Avatar asked Jan 08 '16 17:01

SpeedCoder5


People also ask

How do you create a new DataFrame with only selected columns?

You can create a new DataFrame of a specific column by using DataFrame. assign() method. The assign() method assign new columns to a DataFrame, returning a new object (a copy) with the new columns added to the original ones.

How do I select only certain columns in a DataFrame?

If you have a DataFrame and would like to access or select a specific few rows/columns from that DataFrame, you can use square brackets or other advanced methods such as loc and iloc .


1 Answers

There is a way of doing this and it actually looks similar to R

new = old[['A', 'C', 'D']].copy() 

Here you are just selecting the columns you want from the original data frame and creating a variable for those. If you want to modify the new dataframe at all you'll probably want to use .copy() to avoid a SettingWithCopyWarning.

An alternative method is to use filter which will create a copy by default:

new = old.filter(['A','B','D'], axis=1) 

Finally, depending on the number of columns in your original dataframe, it might be more succinct to express this using a drop (this will also create a copy by default):

new = old.drop('B', axis=1) 
like image 53
johnchase Avatar answered Sep 30 '22 17:09

johnchase