 

Copy all values in a column to a new column in a pandas dataframe

Tags: python, pandas

This is a very basic question, but I cannot seem to find an answer.

I have a dataframe like this, called df:

  A     B     C
  a.1   b.1   c.1
  a.2   b.2   c.2
  a.3   b.3   c.3

Then I extract all the rows from df, where column 'B' has a value of 'b.2'. I assign these results to df_2.

df_2 = df[df['B'] == 'b.2'] 

df_2 becomes:

  A     B     C
  a.2   b.2   c.2

Then, I copy all the values in column 'B' to a new column named 'D', so that df_2 becomes:

  A     B     C     D
  a.2   b.2   c.2   b.2

When I perform an assignment like this:

df_2['D'] = df_2['B'] 

I get the following warning:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy


I have also tried using .loc when creating df_2 like this:

df_2 = df.loc[df['B'] == 'b.2'] 

However, I still get the warning.

Any help is greatly appreciated.

Justin Buchanan asked Sep 20 '15 04:09




1 Answer

You can simply assign column B to the new column, like this:

df['D'] = df['B'] 

Example/Demo:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame([['a.1','b.1','c.1'],['a.2','b.2','c.2'],['a.3','b.3','c.3']],columns=['A','B','C'])

In [3]: df
Out[3]:
     A    B    C
0  a.1  b.1  c.1
1  a.2  b.2  c.2
2  a.3  b.3  c.3

In [4]: df['D'] = df['B']                  #<---What you want.

In [5]: df
Out[5]:
     A    B    C    D
0  a.1  b.1  c.1  b.1
1  a.2  b.2  c.2  b.2
2  a.3  b.3  c.3  b.3

In [6]: df.loc[0,'D'] = 'd.1'

In [7]: df
Out[7]:
     A    B    C    D
0  a.1  b.1  c.1  d.1
1  a.2  b.2  c.2  b.2
2  a.3  b.3  c.3  b.3
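The warning in the question is raised when assigning to the filtered df_2, not to df itself. A minimal sketch, assuming the goal is to add column D only to the filtered rows: calling .copy() on the selection makes df_2 an independent DataFrame, which is the usual way to avoid the SettingWithCopyWarning. The sample data mirrors the frame from the question.

```python
import pandas as pd

# Sample frame matching the one described in the question.
df = pd.DataFrame(
    [["a.1", "b.1", "c.1"], ["a.2", "b.2", "c.2"], ["a.3", "b.3", "c.3"]],
    columns=["A", "B", "C"],
)

# Take an explicit copy of the filtered rows so that df_2 is an
# independent DataFrame rather than a possible view of df.
df_2 = df[df["B"] == "b.2"].copy()

# Assigning a new column to the copy changes only df_2; df is untouched.
df_2["D"] = df_2["B"]

print(df_2)
```

If you would rather not mutate df_2 in place, df_2.assign(D=df_2['B']) returns a new DataFrame with the extra column instead.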
Anand S Kumar answered Oct 03 '22 03:10
