How can I flag a row in a dataframe every time a column changes its string value?
Ex:
Input
ColumnA ColumnB
1       Blue
2       Blue
3       Red
4       Red
5       Yellow

# diff won't work here with strings.... only works on numerical values
dataframe['changed'] = dataframe['ColumnB'].diff()

Desired output:

ColumnA ColumnB changed
1       Blue    0
2       Blue    0
3       Red     1
4       Red     0
5       Yellow  1
The diff() method returns the difference between each row's value and, by default, the previous row's value. The row to compare against can be specified with the periods parameter.
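For instance, on a numeric column (a minimal sketch with made-up values):

import pandas as pd

s = pd.Series([1, 1, 4, 4, 9])

# Default periods=1: difference from the previous row.
print(s.diff().tolist())  # [nan, 0.0, 3.0, 0.0, 5.0]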
Pandas also offers several functions for realigning rows so you can calculate differences between them. The shift method, for example, moves a dataframe's rows up or down, which makes it easy to subtract one row from another.
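A quick illustration of shift-based subtraction (made-up values; the last line is equivalent to s.diff()):

import pandas as pd

s = pd.Series([10, 20, 40])

print(s.shift(1).tolist())        # [nan, 10.0, 20.0]  shifted down
print(s.shift(-1).tolist())       # [20.0, 40.0, nan]  shifted up

# Subtracting the shifted series gives the row-to-row difference:
print((s - s.shift(1)).tolist())  # [nan, 10.0, 20.0]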
You can use the DataFrame.diff() function to find the difference between two rows in a pandas DataFrame, e.g. df.diff(periods=1), where periods is the number of rows back to look when calculating the difference.
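With an explicit periods value (a minimal sketch with made-up numbers):

import pandas as pd

s = pd.Series([1, 3, 6, 10])

# Compare each row with the row two positions back.
print(s.diff(periods=2).tolist())  # [nan, nan, 5.0, 7.0]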
Algorithm:
Step 1: Define two Pandas series, s1 and s2.
Step 2: Compare the series using the Series compare() method.
Step 3: Print their difference.
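A minimal sketch of those steps (note that Series.compare() requires pandas 1.1 or newer and identically labeled series):

import pandas as pd

s1 = pd.Series(['Blue', 'Blue', 'Red'])
s2 = pd.Series(['Blue', 'Red', 'Red'])

# compare() keeps only the positions where the two series differ;
# the 'self' and 'other' columns hold the differing values.
print(s1.compare(s2))
#    self other
# 1  Blue   Red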
Use .shift and compare:

dataframe['changed'] = (dataframe['ColumnB'] != dataframe['ColumnB'].shift(1).fillna(dataframe['ColumnB'])).astype(int)
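Applied to the sample data from the question, this produces the desired flags (a quick check):

import pandas as pd

dataframe = pd.DataFrame({
    'ColumnA': [1, 2, 3, 4, 5],
    'ColumnB': ['Blue', 'Blue', 'Red', 'Red', 'Yellow'],
})

# Flag rows where ColumnB differs from the previous row; fillna makes
# the first row compare against itself, so it is flagged 0.
dataframe['changed'] = (
    dataframe['ColumnB'] != dataframe['ColumnB'].shift(1).fillna(dataframe['ColumnB'])
).astype(int)

print(dataframe)
#    ColumnA ColumnB  changed
# 0        1    Blue        0
# 1        2    Blue        0
# 2        3     Red        1
# 3        4     Red        0
# 4        5  Yellow        1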
I get better performance with ne instead of the actual != comparison:
df['changed'] = df['ColumnB'].ne(df['ColumnB'].shift().bfill()).astype(int)
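Here shift() leaves a NaN in the first row, and bfill() backfills it with the first actual value, so the first row compares equal to itself and is flagged 0, matching the desired output.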
Timings
Using the following setup to produce a larger dataframe:
df = pd.concat([df]*10**5, ignore_index=True)
I get the following timings:
%timeit df['ColumnB'].ne(df['ColumnB'].shift().bfill()).astype(int)
10 loops, best of 3: 38.1 ms per loop

%timeit (df.ColumnB != df.ColumnB.shift()).astype(int)
10 loops, best of 3: 77.7 ms per loop

%timeit df['ColumnB'] == df['ColumnB'].shift(1).fillna(df['ColumnB'])
10 loops, best of 3: 99.6 ms per loop

%timeit (df.ColumnB.ne(df.ColumnB.shift())).astype(int)
10 loops, best of 3: 19.3 ms per loop
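Note that the fastest variant drops the bfill(): the NaN left by shift() then compares unequal to the first value, so the first row is flagged 1 instead of 0.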