I have df:
Hour  Energy Wh
1     4
2     6
3     9
4     15
I would like to add a column that shows the per hour difference. I am using this:
df['Energy Wh/h'] = df['Energy Wh'].diff().fillna(0)
df1:
Hour  Energy Wh  Energy Wh/h
1     4          0
2     6          2
3     9          3
4     15         6
However, the Hour 1 value is showing up as 0 in the Energy Wh/h column, whereas I would like it to show up as 4, like below:
Hour  Energy Wh  Energy Wh/h
1     4          4
2     6          2
3     9          3
4     15         6
I have tried using np.where:
df['Energy Wh/h'] = np.where(df['Hour'] == 1, df['Energy Wh'].diff().fillna(df['Energy Wh']), df['Energy Wh'].diff().fillna(0))
but I am still getting a 0 value in the hour 1 row (df1), with no errors. How do I get the value in 'Energy Wh' for Hour 1 to be filled, instead of 0?
Pandas DataFrame diff() method: diff() returns a DataFrame with the difference between each row's values and, by default, those of the previous row. The periods parameter specifies which row to compare with.
Pandas DataFrame first() method: first() returns the initial rows of a time-indexed DataFrame, based on the specified date offset. The index has to consist of dates for this method to work as expected.
NumPy diff() function: numpy.diff(arr[, n[, axis]]) calculates the n-th order discrete difference along the given axis. The first-order difference is given by out[i] = arr[i+1] - arr[i] along the given axis; higher-order differences are calculated by applying diff recursively.
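To make the difference between the two diff variants concrete, here is a minimal sketch using the values from the question. The variable names (energy, pandas_diff, numpy_diff) are only illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Same values as the 'Energy Wh' column in the question
energy = pd.Series([4, 6, 9, 15], name="Energy Wh")

# pandas keeps the original length: the first row has no previous row,
# so its difference is NaN
pandas_diff = energy.diff()

# numpy returns one element fewer: only the differences between neighbours
numpy_diff = np.diff(energy.to_numpy())

print(pandas_diff.tolist())
print(numpy_diff.tolist())
```

The NaN in the first position of the pandas result is the value that fillna() later replaces, which is why fillna(0) puts a 0 in the Hour 1 row.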
You can just fillna() with the original column, without using np.where:
>>> df['Energy Wh/h'] = df['Energy Wh'].diff().fillna(df['Energy Wh'])
>>> df
      Energy Wh  Energy Wh/h
Hour
1             4          4.0
2             6          2.0
3             9          3.0
4            15          6.0
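For anyone who wants to run this end to end, here is a self-contained sketch of the same idea. It assumes Hour is an ordinary column, as in the question, rather than the index. The astype(int) at the end is my own addition, not part of the snippet above: diff() produces floats because of the NaN, and the cast is one way to get back the integer values shown in the desired output.

```python
import pandas as pd

# Sample data from the question, with Hour assumed to be a regular column
df = pd.DataFrame({"Hour": [1, 2, 3, 4], "Energy Wh": [4, 6, 9, 15]})

# diff() leaves NaN in the first row; fillna() with the original column
# aligns on the index, so that NaN is replaced by the row's own value
diff = df["Energy Wh"].diff()
df["Energy Wh/h"] = diff.fillna(df["Energy Wh"]).astype(int)

print(df)
```

The cast is safe here only because every NaN has been filled first; if the source column itself could contain missing values, it would need handling before astype(int).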