I have 2 dataframes: one is a data source dataframe and the other is a reference dataframe. I want to create an additional column in df1 based on a comparison of the 2 dataframes.
df1 - data source
No | Name
213344 | Apple
242342 | Orange
234234 | Pineapple
df2 - reference table
RGE_FROM | RGE_TO | Value
2100 | 2190 | Sweet
2200 | 2322 | Bitter
2400 | 5000 | Neutral
final - if the first 4 characters of df1.No fall within the range df2.RGE_FROM to df2.RGE_TO, take df2.Value for the derived column df1.DESC; otherwise leave it blank
No | Name | DESC
213344 | Apple | Sweet
242342 | Orange | Neutral
234234 | Pineapple |
Any help is appreciated! Thank you!
Create New Columns in Pandas DataFrame Based on the Values of Other Columns Using the DataFrame.apply() Method. It applies the lambda function passed to apply() to each row of the DataFrame items_df and assigns the resulting series to the Final Price column of items_df.
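A rough sketch of the row-wise apply() pattern described above. Only the names items_df and Final Price come from the text; the other columns and all values are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical data: 'Actual Price' and 'Discount(%)' are assumed column names
items_df = pd.DataFrame({
    "Actual Price": [100, 200, 50],
    "Discount(%)": [10, 25, 0],
})

# axis=1 passes each row to the lambda; the returned series becomes the new column
items_df["Final Price"] = items_df.apply(
    lambda row: row["Actual Price"] - row["Actual Price"] * row["Discount(%)"] / 100,
    axis=1,
)
print(items_df)
```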
A DataFrame is a 2D structure composed of rows and columns, in which data is stored in tabular form. It is size-mutable and can hold heterogeneous data. Arithmetic operations align on both row and column labels.
If you need to apply a function to an existing column in order to compute values that will be added as a new column in the same DataFrame, then the pandas.DataFrame.apply() method should do the trick. For example, you can define your own function and then pass it to apply().
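Applied to the question's data, that idea might look like the sketch below. The helper name lookup_desc is hypothetical, and the code assumes the ranges in df2 do not overlap, so the first match is the only one.

```python
import pandas as pd

df1 = pd.DataFrame({"No": [213344, 242342, 234234],
                    "Name": ["Apple", "Orange", "Pineapple"]})
df2 = pd.DataFrame({"RGE_FROM": [2100, 2200, 2400],
                    "RGE_TO": [2190, 2322, 5000],
                    "Value": ["Sweet", "Bitter", "Neutral"]})

def lookup_desc(row):
    """Hypothetical helper: Value of the range containing the first 4 digits of No, else ''."""
    prefix = int(str(row["No"])[:4])
    in_range = (df2["RGE_FROM"] <= prefix) & (prefix <= df2["RGE_TO"])
    hits = df2.loc[in_range, "Value"]
    return hits.iloc[0] if not hits.empty else ""

# axis=1 calls lookup_desc once per row of df1
df1["DESC"] = df1.apply(lookup_desc, axis=1)
print(df1)
```

This scans df2 once for every row of df1, so it is easy to read but may be slow on large inputs.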
Solution #1: We can use Python’s list comprehension technique to achieve this task. A list comprehension is often faster than a row-wise apply(), though usually slower than a fully vectorized operation. Here we add a new column called ‘Price’ to the dataframe: set the price to 1500 if the ‘Event’ is ‘Music’, else 800.
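A minimal sketch of that list comprehension. The column names and the two prices come from the text; the dataframe name events_df and its rows are assumptions.

```python
import pandas as pd

# Hypothetical sample rows; only 'Event', 'Music', 1500 and 800 are from the text
events_df = pd.DataFrame({"Event": ["Music", "Poetry", "Music", "Comedy"]})

# One output value per input row, in the same order
events_df["Price"] = [1500 if event == "Music" else 800 for event in events_df["Event"]]
print(events_df)
```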
As the output shows, a new column has been added to the dataframe based on a condition. Solution #3: We can use the Series.map() function to achieve the goal. It is a very straightforward method in which a dictionary maps each key found in an existing column to the value for the newly added column.
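The dictionary approach could be sketched like this. The dataframe events_df, the dictionary price_by_event and every value in it are assumptions for illustration. Keys missing from the dictionary come back as NaN.

```python
import pandas as pd

# Hypothetical sample rows and price table
events_df = pd.DataFrame({"Event": ["Music", "Poetry", "Theatre"]})
price_by_event = {"Music": 1500, "Poetry": 800}

# Series.map looks up each 'Event' value in the dictionary
events_df["Price"] = events_df["Event"].map(price_by_event)
print(events_df)
```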
We can create an IntervalIndex from the columns RGE_FROM and RGE_TO, then set this as the index of column Value to create a mapping series, then slice the first four characters of column No and use Series.map to substitute the values from the mapping series.
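import pandas as pd
# Closed intervals [RGE_FROM, RGE_TO]; the first four digits of No are mapped onto them
i = pd.IntervalIndex.from_arrays(df2['RGE_FROM'], df2['RGE_TO'], closed='both')
df1['Value'] = df1['No'].astype(str).str[:4].astype(int).map(df2.set_index(i)['Value'])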
       No       Name    Value
0  213344      Apple    Sweet
1  242342     Orange  Neutral
2  234234  Pineapple      NaN
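For anyone who wants to run this end to end, here is a self-contained sketch of the same approach. It builds the two dataframes from the question, names the new column DESC as the question asks, and adds one step that is not in the answer above: fillna("") to turn unmatched rows into blanks. It assumes the ranges in df2 do not overlap.

```python
import pandas as pd

df1 = pd.DataFrame({"No": [213344, 242342, 234234],
                    "Name": ["Apple", "Orange", "Pineapple"]})
df2 = pd.DataFrame({"RGE_FROM": [2100, 2200, 2400],
                    "RGE_TO": [2190, 2322, 5000],
                    "Value": ["Sweet", "Bitter", "Neutral"]})

# Mapping series: Value indexed by the closed interval [RGE_FROM, RGE_TO]
intervals = pd.IntervalIndex.from_arrays(df2["RGE_FROM"], df2["RGE_TO"], closed="both")
mapping = df2.set_index(intervals)["Value"]

# First four digits of No as integers, looked up against the intervals
prefix = df1["No"].astype(str).str[:4].astype(int)
df1["DESC"] = prefix.map(mapping).fillna("")  # blank instead of NaN (added step)
print(df1)
```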