
pandas diff() giving 0 value for first difference, I want the actual value instead

I have df:

Hour  Energy Wh  
1        4          
2        6           
3        9
4        15

I would like to add a column that shows the per hour difference. I am using this:

df['Energy Wh/h'] = df['Energy Wh'].diff().fillna(0)

df1:

Hour  Energy Wh  Energy Wh/h
1        4          0
2        6          2 
3        9          3
4        15         6

However, the Hour 1 value is showing up as 0 in the Energy Wh/h column, whereas I would like it to show up as 4, like below:

Hour  Energy Wh  Energy Wh/h
1        4          4
2        6          2 
3        9          3
4        15         6

I have tried using np.where:

df['Energy Wh/h']  = np.where(df['Hour'] == 1,df['Energy Wh'].diff().fillna(df['Energy Wh']),df['Energy Wh'].diff().fillna(0))

but I am still getting a 0 value in the hour 1 row (df1), with no errors. How do I get the value in 'Energy Wh' for Hour 1 to be filled, instead of 0?

warrenfitzhenry asked Mar 12 '17 14:03



1 Answer

You can just fillna() with the original column, without using np.where:

>>> df['Energy Wh/h'] = df['Energy Wh'].diff().fillna(df['Energy Wh'])
>>> df
      Energy Wh  Energy Wh/h
Hour
   1          4          4.0
   2          6          2.0
   3          9          3.0
   4         15          6.0
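
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same idea. It assumes Hour is an ordinary column and that the readings are whole numbers; the name `delta` and the final `astype(int)` cast are my additions, not part of the question. `diff()` leaves NaN in the first row because there is no previous value, and `fillna()` given a Series aligns on the index, so only that row picks up the original reading. The `shift(fill_value=0)` variant is an alternative that treats the reading before Hour 1 as 0 and keeps the integer dtype.

```python
import pandas as pd

# Sample data from the question (Hour assumed to be a regular column).
df = pd.DataFrame({"Hour": [1, 2, 3, 4], "Energy Wh": [4, 6, 9, 15]})

# diff() yields NaN for the first row, since it has no predecessor.
delta = df["Energy Wh"].diff()

# fillna() with a Series aligns on the index, so only the NaN row
# is replaced by the original reading. The cast back to int assumes
# the readings are whole numbers.
df["Energy Wh/h"] = delta.fillna(df["Energy Wh"]).astype(int)

# Alternative: subtract the shifted column, treating the value
# before the first row as 0. This avoids the float round trip.
alt = df["Energy Wh"] - df["Energy Wh"].shift(fill_value=0)

print(df)
```

If the readings can be fractional, drop the `astype(int)` call and keep the float column.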
AChampion answered Sep 19 '22 21:09