I'm new to programming and pandas, so please don't judge too harshly.
I need to add a new column to this table, with values taken from the other columns.
import pandas as pd
inp = [{'Date':2003, 'b1':5,'b2':0,'b3':4,'b4':3},{'Date':2003, 'b1':2,'b2':2,'b3':1,'b4':8},{'Date':2004, 'b1':2,'b2':3,'b3':1,'b4':1},{'Date':2004, 'b1':1,'b2':8,'b3':2,'b4':1},{'Date':2005, 'b1':2,'b2':1,'b3':6,'b4':2},{'Date':2006, 'b1':1,'b2':7,'b3':2,'b4':9}]
df = pd.DataFrame(inp)
print(df)
   Date  b1  b2  b3  b4
0  2003   5   0   4   3
1  2003   2   2   1   8
2  2004   2   3   1   1
3  2004   1   8   2   1
4  2005   2   1   6   2
5  2006   1   7   2   9
Namely, the source column depends on the date: if "Date" == 2003, I need the value from column b1; if "Date" == 2004, the value from column b2; if "Date" == 2005, the value from column b3; and so on. So the values of the new column should be: 5, 2, 3, 8, 6, 9.
I have a dictionary of correspondences, something like:
Corr_dict = {2003:'b1',2004:'b2',2005:'b4',2006:'b7'...}
This is just an example. I have a large dataset, so I want to understand the mechanics.
Sorry for the poor question formatting. I will be very grateful for any help.
Expected output:
   Date  b1  b2  b3  b4  vals
0  2003   5   0   4   3   5.0
1  2003   2   2   1   8   2.0
2  2004   2   3   1   1   3.0
3  2004   1   8   2   1   8.0
4  2005   2   1   6   2   6.0
5  2006   1   7   2   9   9.0
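I'd use df.lookup (deprecated in pandas 1.2 and removed in 2.0, so this needs an older pandas), where dd is the Date-to-column dictionary defined in the MCVE below: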
df['Correspond'] = df.lookup(df.index, df['Date'].map(dd))
MCVE:
import pandas as pd
import numpy as np
inp = [{'Date':2003, 'b1':5,'b2':0,'b3':4,'b4':3},{'Date':2003, 'b1':2,'b2':2,'b3':1,'b4':8},{'Date':2004, 'b1':2,'b2':3,'b3':1,'b4':1},{'Date':2004, 'b1':1,'b2':8,'b3':2,'b4':1},{'Date':2005, 'b1':2,'b2':1,'b3':6,'b4':2},{'Date':2006, 'b1':1,'b2':7,'b3':2,'b4':9}]
df = pd.DataFrame(inp)
dd = {2003:'b1', 2004:'b2', 2005:'b3', 2006:'b4'}
df['Correspond'] = df.lookup(df.index, df['Date'].map(dd))
print(df)
Output:
   Date  b1  b2  b3  b4  Correspond
0  2003   5   0   4   3           5
1  2003   2   2   1   8           2
2  2004   2   3   1   1           3
3  2004   1   8   2   1           8
4  2005   2   1   6   2           6
5  2006   1   7   2   9           9
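On pandas 2.0 and later, where DataFrame.lookup no longer exists, the same row/column pairing can be done with NumPy integer indexing. What follows is a minimal sketch, not the answerer's method: it assumes every Date has an entry in dd and every mapped name is a real column, and the helper names col_pos and row_pos are illustrative only.

```python
import numpy as np
import pandas as pd

inp = [{'Date': 2003, 'b1': 5, 'b2': 0, 'b3': 4, 'b4': 3},
       {'Date': 2003, 'b1': 2, 'b2': 2, 'b3': 1, 'b4': 8},
       {'Date': 2004, 'b1': 2, 'b2': 3, 'b3': 1, 'b4': 1},
       {'Date': 2004, 'b1': 1, 'b2': 8, 'b3': 2, 'b4': 1},
       {'Date': 2005, 'b1': 2, 'b2': 1, 'b3': 6, 'b4': 2},
       {'Date': 2006, 'b1': 1, 'b2': 7, 'b3': 2, 'b4': 9}]
df = pd.DataFrame(inp)
dd = {2003: 'b1', 2004: 'b2', 2005: 'b3', 2006: 'b4'}

# Column label wanted for each row, e.g. 'b1' for the 2003 rows.
cols = df['Date'].map(dd)

# Translate labels to integer column positions; unknown labels become -1.
col_pos = df.columns.get_indexer(cols)
if (col_pos < 0).any():
    raise KeyError("some Date values have no matching column in dd")

# One (row, column) pair per row, picked from the underlying array.
# Note: with mixed column dtypes, to_numpy() would upcast to object.
row_pos = np.arange(len(df))
df['Correspond'] = df.to_numpy()[row_pos, col_pos]
print(df)
```

The explicit check matters because a position of -1 would otherwise silently select the last column instead of failing.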
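IIUC, with d being the Date-to-column dictionary:
d = {2003:'b1', 2004:'b2', 2005:'b3', 2006:'b4'}
s = df.set_index('Date').stack()  # Series indexed by (Date, column name) pairs
df['New'] = s[s.index.isin(list(d.items()))].values  # keep only the pairs listed in d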
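To make the mechanics of the stack approach visible, here is a hedged step-by-step sketch. It assumes the same data as the question and a dictionary d mapping 2003..2006 to b1..b4; the names mask and picked, and the length check, are additions for illustration and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': [2003, 2003, 2004, 2004, 2005, 2006],
    'b1': [5, 2, 2, 1, 2, 1],
    'b2': [0, 2, 3, 8, 1, 7],
    'b3': [4, 1, 1, 2, 6, 2],
    'b4': [3, 8, 1, 1, 2, 9],
})
d = {2003: 'b1', 2004: 'b2', 2005: 'b3', 2006: 'b4'}

# Each cell becomes one entry keyed by (Date, column name),
# laid out row by row in the original row order.
s = df.set_index('Date').stack()

# True only where the (Date, column name) key is one of the pairs in d.
mask = s.index.isin(list(d.items()))
picked = s[mask]

# The positional assignment below is only valid if exactly one value
# survives per row; a Date missing from d would break that.
assert len(picked) == len(df)
df['New'] = picked.to_numpy()
print(df)
```

The assignment is purely positional, so it relies on the filtered values coming out in the same order as the rows of df.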
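IIUC, I would write a function for that (it assumes the b columns follow Date in consecutive year order, so the column position can be computed from the year):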
def extract(df, year):
    min_year = df['Date'].min()
    return df.loc[df['Date']==year, df.columns[year+1 - min_year]]
extract(df, 2003)
# 0    5
# 1    2
# Name: b1, dtype: int64
And for all years, combined into one column:
pd.concat(extract(df, year).rename('new_col') for year in df['Date'].unique())
Output:
0    5
1    2
2    3
3    8
4    6
5    9
Name: new_col, dtype: int64
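For readers who want the mechanics spelled out, the same result can be reached with one boolean mask per dictionary entry instead of computing column positions from the year. This is only a sketch: the name corr is hypothetical, the target column vals mirrors the expected output in the question, and the dictionary is assumed to be the 2003..2006 mapping used in the MCVE above.

```python
import numpy as np
import pandas as pd

inp = [{'Date': 2003, 'b1': 5, 'b2': 0, 'b3': 4, 'b4': 3},
       {'Date': 2003, 'b1': 2, 'b2': 2, 'b3': 1, 'b4': 8},
       {'Date': 2004, 'b1': 2, 'b2': 3, 'b3': 1, 'b4': 1},
       {'Date': 2004, 'b1': 1, 'b2': 8, 'b3': 2, 'b4': 1},
       {'Date': 2005, 'b1': 2, 'b2': 1, 'b3': 6, 'b4': 2},
       {'Date': 2006, 'b1': 1, 'b2': 7, 'b3': 2, 'b4': 9}]
df = pd.DataFrame(inp)
corr = {2003: 'b1', 2004: 'b2', 2005: 'b3', 2006: 'b4'}  # assumed mapping

# Start with an empty (NaN) column; rows whose Date is not in corr stay NaN.
df['vals'] = np.nan

for year, col in corr.items():
    mask = df['Date'] == year              # rows belonging to this year
    df.loc[mask, 'vals'] = df.loc[mask, col]  # copy from the mapped column

print(df)
```

Because the column starts as NaN, the result is float, which matches the 5.0, 2.0, ... values shown in the question's expected output. The loop runs once per dictionary entry rather than once per row, so it stays reasonable on a large dataset with few distinct dates.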