I have the following dataframe df:
Customer_ID | 2015 | 2016 |2017 | Year_joined_mailing
ABC 5 6 10 2015
BCD 6 7 3 2016
DEF 10 4 5 2017
GHI 8 7 10 2016
I would like to look up the value of the customer in the year they joined the mailing list and save it in a new column.
Output would be:
Customer_ID | 2015 | 2016 |2017 | Year_joined_mailing | Purchases_1st_year
ABC 5 6 10 2015 5
BCD 6 7 3 2016 7
DEF 10 4 5 2017 5
GHI 8 9 10 2016 9
I have found some solutions for match vlookup in python, but none that would use the headers of other columns.
I would do it like this, assuming that the column headers and the Year_joined_mailing
are the same data type and that all Year_joined_mailing
values are valid columns. If the datatypes are not the same, you could convert it by adding str()
or int()
where appropriate.
df['Purchases_1st_year'] = [df[df['Year_joined_mailing'][i]][i] for i in df.index]
What we're doing here is iterating over the indexes in the dataframe to get the 'Year_joined_mailing'
field for that index, then using that to get the column we want, and again selecting that index from the column, pushing it all to a list and assigning this to our new column 'Year_joined_mailing'
If your 'Year_joined_mailing'
column will not always be a valid column name, then try:
from numpy import nan
new_col = []
for i in df.index:
try:
new_col.append(df[df['Year_joined_mailing'][i]][i])
except IndexError:
new_col.append(nan) #or whatever null value you want here)
df['Purchases_1st_year'] = new_col
This longer code snippet accomplishes the same thing, but will not break if 'Year_joined_mailing'
is not in df.columns
you can apply "apply" to each row
df.apply(lambda x: x[x['Year_joined_mailing']],axis=1)
Deprecation Notice:
lookup
was deprecated in v1.2.0
Use pd.DataFrame.lookup
Keep in mind that I'm assuming Customer_ID
is the index.
df.lookup(df.index, df.Year_joined_mailing)
array([5, 7, 5, 7])
df.assign(
Purchases_1st_year=df.lookup(df.index, df.Year_joined_mailing)
)
2015 2016 2017 Year_joined_mailing Purchases_1st_year
Customer_ID
ABC 5 6 10 2015 5
BCD 6 7 3 2016 7
DEF 10 4 5 2017 5
GHI 8 7 10 2016 7
However, you have to be careful with comparing possible strings in the column names and integers in the first year column...
Nuclear option to ensure type comparisons are respected.
df.assign(
Purchases_1st_year=df.rename(columns=str).lookup(
df.index, df.Year_joined_mailing.astype(str)
)
)
2015 2016 2017 Year_joined_mailing Purchases_1st_year
Customer_ID
ABC 5 6 10 2015 5
BCD 6 7 3 2016 7
DEF 10 4 5 2017 5
GHI 8 7 10 2016 7
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With