
Pandas compare next row

I have a dataframe like this

    import pandas as pd

    d = {}
    d['z'] = ['Q8', 'Q8', 'Q7', 'Q9', 'Q9']
    d['t'] = ['10:30', '10:31', '10:38', '10:40', '10:41']
    d['qty'] = [20, 20, 9, 12, 12]
    df = pd.DataFrame(d)  # build the dataframe from the dict

I want to compare each row with the row that follows it:

  1. is qty the same as in the next row AND
  2. is t greater in the next row AND
  3. is the z value the same as in the next row

The desired output is:

       qty                   t   z  valid
    0   20 2015-06-05 10:30:00  Q8  False
    1   20 2015-06-05 10:31:00  Q8   True
    2    9 2015-06-05 10:38:00  Q7  False
    3   12 2015-06-05 10:40:00  Q9  False
    4   12 2015-06-05 10:41:00  Q9   True
NinjaGaiden asked Jun 05 '15 18:06



1 Answer

Looks like you want to use the Series.shift method.

Using this method, you can generate new columns that are offset from the original columns, like this:

    df['qty_s'] = df['qty'].shift(-1)
    df['t_s'] = df['t'].shift(-1)
    df['z_s'] = df['z'].shift(-1)

Now you can compare these:

    df['is_something'] = (
        (df['qty'] == df['qty_s'])
        & (df['t'] < df['t_s'])
        & (df['z'] == df['z_s'])
    )
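A self-contained sketch of the comparison above, run on the question's data. A few things here are assumptions rather than part of the original answer: `t` is parsed with `pd.to_datetime` so that `<` compares times rather than strings, the whole frame is shifted once instead of creating `_s` helper columns, and the names `nxt` and `next_matches` are purely illustrative.

```python
import pandas as pd

# Data from the question; parsing 't' is an assumption made for this sketch.
df = pd.DataFrame({
    "z": ["Q8", "Q8", "Q7", "Q9", "Q9"],
    "t": pd.to_datetime(["10:30", "10:31", "10:38", "10:40", "10:41"], format="%H:%M"),
    "qty": [20, 20, 9, 12, 12],
})

# Row i of nxt holds row i+1 of df; the last row is filled with NaN/NaT.
nxt = df.shift(-1)

# Comparisons against NaN/NaT evaluate to False, so the last row is never flagged.
df["next_matches"] = (
    (df["qty"] == nxt["qty"])
    & (df["t"] < nxt["t"])
    & (df["z"] == nxt["z"])
)
print(df)
```

With `shift(-1)` the flag lands on the first row of each matching pair (rows 0 and 3 here), whereas the `valid` column in the question marks the second row of each pair.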

Here is a simplified example of how Series.shift works. It compares each row with the previous one using shift(1); shift(-1) compares with the next row instead:

    import numpy as np
    import pandas as pd

    # pd.np is no longer available in recent pandas, so NumPy is imported directly
    df = pd.DataFrame({"temp_celcius": np.random.choice(10, 10) + 20},
                      index=pd.date_range("2015-05-15", "2015-05-24"))
    df
                temp_celcius
    2015-05-15            21
    2015-05-16            28
    2015-05-17            27
    2015-05-18            21
    2015-05-19            25
    2015-05-20            28
    2015-05-21            25
    2015-05-22            22
    2015-05-23            29
    2015-05-24            25

    df["temp_c_yesterday"] = df["temp_celcius"].shift(1)
    df
                temp_celcius  temp_c_yesterday
    2015-05-15            21               NaN
    2015-05-16            28                21
    2015-05-17            27                28
    2015-05-18            21                27
    2015-05-19            25                21
    2015-05-20            28                25
    2015-05-21            25                28
    2015-05-22            22                25
    2015-05-23            29                22
    2015-05-24            25                29

    df["warmer_than_yesterday"] = df["temp_celcius"] > df["temp_c_yesterday"]
    df
                temp_celcius  temp_c_yesterday warmer_than_yesterday
    2015-05-15            21               NaN                 False
    2015-05-16            28                21                  True
    2015-05-17            27                28                 False
    2015-05-18            21                27                 False
    2015-05-19            25                21                  True
    2015-05-20            28                25                  True
    2015-05-21            25                28                 False
    2015-05-22            22                25                 False
    2015-05-23            29                22                  True
    2015-05-24            25                29                 False
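The previous-row comparison from the temperature example can also be applied to the question's data. The sketch below is one possible reading of the question, not the confirmed intent: it assumes the `valid` flag belongs on the later row of each pair, takes the timestamps from the desired output, and uses `prev` as an illustrative name.

```python
import pandas as pd

# Values taken from the desired output shown in the question.
df = pd.DataFrame({
    "qty": [20, 20, 9, 12, 12],
    "t": pd.to_datetime([
        "2015-06-05 10:30", "2015-06-05 10:31", "2015-06-05 10:38",
        "2015-06-05 10:40", "2015-06-05 10:41",
    ]),
    "z": ["Q8", "Q8", "Q7", "Q9", "Q9"],
})

# Row i of prev holds row i-1 of df; the first row is filled with NaN/NaT.
prev = df.shift(1)

# A row is valid when it repeats the previous qty and z at a later time.
df["valid"] = (
    (df["qty"] == prev["qty"])
    & (df["t"] > prev["t"])
    & (df["z"] == prev["z"])
)
print(df)
```

Under these assumptions `valid` comes out as False, True, False, False, True, which matches the column in the question.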

If I misunderstood your query, please post a comment and I'll update my answer.

firelynx answered Sep 30 '22 17:09