
How to create a new column based on values from other columns in a Pandas DataFrame

I'm new to programming and Pandas, so please don't judge too harshly.

I need to add a new column to this table, with values taken from the other columns.

import pandas as pd

inp = [{'Date':2003, 'b1':5,'b2':0,'b3':4,'b4':3},{'Date':2003, 'b1':2,'b2':2,'b3':1,'b4':8},{'Date':2004, 'b1':2,'b2':3,'b3':1,'b4':1},{'Date':2004, 'b1':1,'b2':8,'b3':2,'b4':1},{'Date':2005, 'b1':2,'b2':1,'b3':6,'b4':2},{'Date':2006, 'b1':1,'b2':7,'b3':2,'b4':9}]
df = pd.DataFrame(inp)
print(df)

   Date  b1  b2  b3  b4
0  2003   5   0   4   3
1  2003   2   2   1   8
2  2004   2   3   1   1
3  2004   1   8   2   1
4  2005   2   1   6   2
5  2006   1   7   2   9

Which column to read depends on the date. If the value of column "Date" is 2003, I need the value from column b1; if "Date" is 2004, the value from column b2; if "Date" is 2005, the value from column b3; and so on. So the values of the new column should be: 5, 2, 3, 8, 6, 9.

I have a dictionary of correspondences, something like:

Corr_dict = {2003:'b1',2004:'b2',2005:'b4',2006:'b7'...}

This is just an example. I have a large dataset, so I want to understand the mechanics.

Sorry for the poor question formatting. I will be very grateful for any help.

Expected output:

   Date  b1  b2  b3  b4  vals
0  2003   5   0   4   3   5.0
1  2003   2   2   1   8   2.0
2  2004   2   3   1   1   3.0
3  2004   1   8   2   1   8.0
4  2005   2   1   6   2   6.0
5  2006   1   7   2   9   9.0
Roman Perkhaliuk asked Apr 09 '20 14:04




3 Answers

I'd use df.lookup:

df['Correspond'] = df.lookup(df.index, df['Date'].map(dd))

MCVE:

import pandas as pd

import numpy as np

inp = [{'Date':2003, 'b1':5,'b2':0,'b3':4,'b4':3},{'Date':2003, 'b1':2,'b2':2,'b3':1,'b4':8},{'Date':2004, 'b1':2,'b2':3,'b3':1,'b4':1},{'Date':2004, 'b1':1,'b2':8,'b3':2,'b4':1},{'Date':2005, 'b1':2,'b2':1,'b3':6,'b4':2},{'Date':2006, 'b1':1,'b2':7,'b3':2,'b4':9}]
df = pd.DataFrame(inp)

dd = {2003:'b1', 2004:'b2', 2005:'b3', 2006:'b4'}

df['Correspond'] = df.lookup(df.index, df['Date'].map(dd))
print(df)

output:

   Date  b1  b2  b3  b4  Correspond
0  2003   5   0   4   3           5
1  2003   2   2   1   8           2
2  2004   2   3   1   1           3
3  2004   1   8   2   1           8
4  2005   2   1   6   2           6
5  2006   1   7   2   9           9
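
DataFrame.lookup was deprecated in pandas 1.2.0 and removed in pandas 2.0, so on current versions the call above raises an AttributeError. A minimal sketch of an equivalent using NumPy integer indexing follows. It assumes every value in "Date" has an entry in dd and that the mapped column exists in df; the names cols and col_pos are illustrative only.

```python
import numpy as np
import pandas as pd

inp = [{'Date': 2003, 'b1': 5, 'b2': 0, 'b3': 4, 'b4': 3},
       {'Date': 2003, 'b1': 2, 'b2': 2, 'b3': 1, 'b4': 8},
       {'Date': 2004, 'b1': 2, 'b2': 3, 'b3': 1, 'b4': 1},
       {'Date': 2004, 'b1': 1, 'b2': 8, 'b3': 2, 'b4': 1},
       {'Date': 2005, 'b1': 2, 'b2': 1, 'b3': 6, 'b4': 2},
       {'Date': 2006, 'b1': 1, 'b2': 7, 'b3': 2, 'b4': 9}]
df = pd.DataFrame(inp)

dd = {2003: 'b1', 2004: 'b2', 2005: 'b3', 2006: 'b4'}

# Map each row's Date to the name of the column to read from.
cols = df['Date'].map(dd)

# Translate those column names into integer positions in df.columns.
# Assumption: every name is present; get_indexer returns -1 for a missing
# name, which would silently select the last column.
col_pos = df.columns.get_indexer(cols)

# Pick one value per row: row i, column col_pos[i].
df['Correspond'] = df.to_numpy()[np.arange(len(df)), col_pos]
print(df)
```

If some dates may be missing from the dictionary, it is safer to check cols.isna().any() before indexing.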
Scott Boston answered Oct 19 '22 17:10



IIUC

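# d is the Date -> column dictionary, e.g.
d = {2003: 'b1', 2004: 'b2', 2005: 'b3', 2006: 'b4'}
s = df.set_index('Date').stack()
df['New'] = s[s.index.isin(list(d.items()))].values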
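
To see why this works, here is a step-by-step sketch of the same idea. It assumes each date maps to exactly one existing column, so the filter keeps exactly one value per row, in the original row order; the names mask and picked are illustrative only.

```python
import pandas as pd

inp = [{'Date': 2003, 'b1': 5, 'b2': 0, 'b3': 4, 'b4': 3},
       {'Date': 2003, 'b1': 2, 'b2': 2, 'b3': 1, 'b4': 8},
       {'Date': 2004, 'b1': 2, 'b2': 3, 'b3': 1, 'b4': 1},
       {'Date': 2004, 'b1': 1, 'b2': 8, 'b3': 2, 'b4': 1},
       {'Date': 2005, 'b1': 2, 'b2': 1, 'b3': 6, 'b4': 2},
       {'Date': 2006, 'b1': 1, 'b2': 7, 'b3': 2, 'b4': 9}]
df = pd.DataFrame(inp)

d = {2003: 'b1', 2004: 'b2', 2005: 'b3', 2006: 'b4'}

# stack() reshapes the wide table into a Series whose index is the pair
# (Date, column name), one entry per cell, row by row.
s = df.set_index('Date').stack()

# d.items() yields the wanted (Date, column name) pairs; isin() marks the
# entries of s whose index pair is one of them.
mask = s.index.isin(list(d.items()))
picked = s[mask]

# One value survives per original row, so it can be assigned positionally.
df['New'] = picked.to_numpy()
print(df)
```

The assignment is positional, so it is only valid when len(picked) == len(df).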
BENY answered Oct 19 '22 17:10



IIUC, I would write a function for that:

def extract(df, year):
    min_year = df['Date'].min()
    return df.loc[df['Date']==year, df.columns[year+1 - min_year]]

extract(df, 2003)
# 0    5
# 1    2
# Name: b1, dtype: int64

And for all years, combined into one column:

pd.concat(extract(df, year).rename('new_col') for year in df['Date'].unique())

Output:

0    5
1    2
2    3
3    8
4    6
5    9
Name: new_col, dtype: int64
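
The function above picks the column by position, which assumes the years are consecutive and line up with the column order. A hedged variant that uses the question's dictionary instead and writes the result back into the DataFrame might look like this. The names corr and extract_by_dict are hypothetical, and the dictionary values are taken from the expected output rather than from the question's example dictionary.

```python
import pandas as pd

inp = [{'Date': 2003, 'b1': 5, 'b2': 0, 'b3': 4, 'b4': 3},
       {'Date': 2003, 'b1': 2, 'b2': 2, 'b3': 1, 'b4': 8},
       {'Date': 2004, 'b1': 2, 'b2': 3, 'b3': 1, 'b4': 1},
       {'Date': 2004, 'b1': 1, 'b2': 8, 'b3': 2, 'b4': 1},
       {'Date': 2005, 'b1': 2, 'b2': 1, 'b3': 6, 'b4': 2},
       {'Date': 2006, 'b1': 1, 'b2': 7, 'b3': 2, 'b4': 9}]
df = pd.DataFrame(inp)

# Hypothetical Date -> column mapping, consistent with the expected output.
corr = {2003: 'b1', 2004: 'b2', 2005: 'b3', 2006: 'b4'}

def extract_by_dict(df, year, corr):
    """Return the rows for `year`, taken from the column that `corr` names."""
    return df.loc[df['Date'] == year, corr[year]]

# Build one piece per year, then glue the pieces together. Each piece keeps
# its original row labels, so the assignment aligns on the index.
pieces = [extract_by_dict(df, year, corr).rename('new_col')
          for year in df['Date'].unique()]
df['new_col'] = pd.concat(pieces)
print(df)
```

Because the assignment aligns on the index, this relies on the DataFrame having unique row labels.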
Quang Hoang answered Oct 19 '22 18:10
