I have a pandas DataFrame that I want to convert into a timetable (for visualization purposes) by using groupby, adding an arbitrary number of columns based on hourly increments, and populating them with data from a third column.
The source DataFrame might look like:
ID Hour Floor
Jay 2 34
Jay 3 34
Tim 0 36
Tim 1 34
Tim 2 36
Tom 3 32
Tom 4 36
Rob 3 31
Rob 4 32
Rob 5 33
Rob 6 34
...
What I am aiming for is:
ID HOUR_0 HOUR_1 HOUR_2 HOUR_3 HOUR_4 HOUR_5 HOUR_6...
Jay 0 0 34 34 0 0 0
Tim 36 34 36 0 0 0 0
Tom 0 0 0 32 36 0 0
Rob 0 0 0 31 32 33 34
What I can't work out (without manually constructing this using loops) is how to add an arbitrary number of columns (after a groupby operation) based on the unique values or range of hours in the first DataFrame, and then fill each column from the Hour and Floor columns of the first DataFrame.
Any ideas?
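Because I can't help but show how this works with pd.factorize:
import numpy as np
import pandas as pd

# i, j: integer row/column codes; r, c: the unique IDs and the sorted unique hours
i, r = pd.factorize(df.ID)
j, c = pd.factorize(df.Hour, sort=True)
# start from a zero-filled grid, then write each Floor into its (ID, Hour) cell
b = np.zeros((r.size, c.size), df.Floor.dtype)
b[i, j] = df.Floor.values
d = pd.DataFrame(b, r, [f'Hour_{h}' for h in c])
d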
Hour_0 Hour_1 Hour_2 Hour_3 Hour_4 Hour_5 Hour_6
Jay 0 0 34 34 0 0 0
Tim 36 34 36 0 0 0 0
Tom 0 0 0 32 36 0 0
Rob 0 0 0 31 32 33 34
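For readers unfamiliar with pd.factorize, here is a minimal sketch of what it returns: an array of integer codes plus the unique values those codes point to. The small ids and hours Series below are made up for illustration and are not part of the original data.

```python
import pandas as pd

# Hypothetical toy inputs, only to show what factorize hands back.
ids = pd.Series(["Jay", "Jay", "Tim", "Tom"])
hours = pd.Series([2, 3, 0, 1])

# Codes follow order of first appearance unless sort=True is passed.
codes, uniques = pd.factorize(ids)
hour_codes, hour_uniques = pd.factorize(hours, sort=True)

print(codes, list(uniques))            # row positions and their labels
print(hour_codes, list(hour_uniques))  # column positions and their labels
```

These codes are what make the b[i, j] = ... assignment above work: each row of the source frame is mapped to one (row, column) position in the zero-filled array.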
Isn't this a simple pivot?
df.pivot(index='ID', columns='Hour', values='Floor').fillna(0).add_prefix('Hour_')
Out[71]:
Hour Hour_0 Hour_1 Hour_2 Hour_3 Hour_4 Hour_5 Hour_6
ID
Jay 0.0 0.0 34.0 34.0 0.0 0.0 0.0
Rob 0.0 0.0 0.0 31.0 32.0 33.0 34.0
Tim 36.0 34.0 36.0 0.0 0.0 0.0 0.0
Tom 0.0 0.0 0.0 32.0 36.0 0.0 0.0
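If the float values and the alphabetical ID order in that output are unwanted, one possible variation is sketched below. It assumes each ID/Hour pair appears at most once (as in the sample data) and that hours should run from 0 up to the largest hour present. The df here is rebuilt from the rows shown in the question, and the name wide is only illustrative.

```python
import pandas as pd

# Sample data copied from the question (the rows before the "...").
df = pd.DataFrame({
    "ID": ["Jay", "Jay", "Tim", "Tim", "Tim", "Tom", "Tom",
           "Rob", "Rob", "Rob", "Rob"],
    "Hour": [2, 3, 0, 1, 2, 3, 4, 3, 4, 5, 6],
    "Floor": [34, 34, 36, 34, 36, 32, 36, 31, 32, 33, 34],
})

wide = (
    df.set_index(["ID", "Hour"])["Floor"]
      .unstack(fill_value=0)                     # missing cells become 0, not NaN
      .reindex(index=df["ID"].unique(),          # restore first-appearance ID order
               columns=range(df["Hour"].max() + 1),  # include hours with no rows
               fill_value=0)
      .add_prefix("Hour_")
)
print(wide)
```

Filling with 0 during the reshape avoids the NaN step that turns the pivot result into floats, and reindexing over a full range adds a zero column for any hour that never occurs in the data.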