I have a dataset with the exact time and quantity of e-vehicles charging at a charging station and its charging power in kW.
What I am trying to do is to calculate how many e-vehicles are charging at the exact same time (taking into account start
and end
times), and according to that, calculate the required charging power (sum up the power_kW
) needed for each hour. For example, from 18-19h
I need only 100kW of power.
Below I shared a part of the data, I obviously have the data for all the 24 hours, and for each hour I should have a total needed charging station power in kW.
Obviously, if one vehicle finishes charging at 20:50, and another one starts at 20:50 - they overlap and I need to count 2*power_kW
. Also, in some slots, I have more than 1 vehicle starting and finishing at the same time.
I find it hard to do it fast and automatically with the code, given that the times are in hour:minute
format, so I seek help here. I checked similar posts and couldn't find anything to bring me to a solution on my own.
Does someone have an idea how to do this efficently and correctly?
Input data:
import pandas as pd
mycolumns = ['start_timeslot', 'start', 'end', 'power_kW', 'vehicle_count']
data = [['18H-19H', '18:26', '20:27', 100, 1], ['19H-20H', '19:06', '21:16', 100, 1],
['19H-20H', '19:19', '21:16', 100, 1], ['19H-20H', '19:36', '21:33', 100, 1],
['19H-20H', '19:54', '20:19', 100, 2], ['19H-20H', '19:55', '22:01', 100, 1],
['20H-21H', '20:04', '22:06', 100, 1], ['20H-21H', '20:11', '22:04', 100, 2],
['20H-21H', '20:15', '22:04', 100, 1], ['20H-21H', '20:16', '22:04', 100, 1],
['20H-21H', '20:17', '22:08', 100, 1], ['20H-21H', '20:19', '22:09', 100, 1],
['20H-21H', '20:20', '22:01', 100, 2], ['20H-21H', '20:22', '22:35', 100, 1],
['20H-21H', '20:28', '22:34', 100, 2], ['20H-21H', '20:29', '22:22', 100, 1],
['20H-21H', '20:30', '22:14', 100, 1], ['20H-21H', '20:31', '22:10', 100, 1],
['20H-21H', '20:37', '22:31', 100, 1], ['20H-21H', '20:41', '22:29', 100, 2],
['20H-21H', '20:43', '22:34', 100, 1], ['20H-21H', '20:46', '22:39', 100, 1],
['20H-21H', '20:47', '22:35', 100, 1], ['20H-21H', '20:52', '22:34', 100, 1],
['20H-21H', '20:52', '23:09', 100, 1], ['20H-21H', '20:53', '21:59', 100, 1],
['20H-21H', '20:54', '21:49', 100, 1], ['20H-21H', '20:56', '22:10', 100, 1],
['20H-21H', '20:56', '22:55', 100, 1], ['21H-22H', '21:03', '22:51', 100, 1],
['21H-22H', '21:05', '23:12', 100, 1], ['21H-22H', '21:08', '22:59', 100, 1],
['21H-22H', '21:10', '23:27', 100, 1], ['21H-22H', '21:10', '23:30', 100, 1],
['21H-22H', '21:15', '23:23', 100, 1], ['21H-22H', '21:19', '21:56', 100, 1],
['21H-22H', '21:21', '22:48', 100, 1], ['21H-22H', '21:25', '23:26', 100, 1],
['21H-22H', '21:25', '23:32', 100, 1], ['21H-22H', '21:27', '22:55', 100, 1],
['21H-22H', '21:27', '23:32', 100, 1], ['21H-22H', '21:33', '23:11', 100, 1],
['21H-22H', '21:37', '23:04', 100, 1], ['21H-22H', '21:39', '00:05', 100, 1],
['21H-22H', '21:40', '23:08', 100, 1], ['21H-22H', '21:45', '23:04', 100, 1],
['21H-22H', '21:49', '00:06', 100, 1], ['21H-22H', '21:54', '00:07', 100, 1],
['21H-22H', '21:58', '00:02', 100, 1], ['21H-22H', '21:58', '00:24', 100, 1],
['22H-23H', '22:10', '00:19', 100, 1]]
df = pd.DataFrame(data, columns=mycolumns)
df
start_timeslot start end power_kW vehicle_count
0 18H-19H 18:26 20:27 100 1
1 19H-20H 19:06 21:16 100 1
2 19H-20H 19:19 21:16 100 1
3 19H-20H 19:36 21:33 100 1
4 19H-20H 19:54 20:19 100 2
5 19H-20H 19:55 22:01 100 1
6 20H-21H 20:04 22:06 100 1
7 20H-21H 20:11 22:04 100 2
8 20H-21H 20:15 22:04 100 1
9 20H-21H 20:16 22:04 100 1
10 20H-21H 20:17 22:08 100 1
11 20H-21H 20:19 22:09 100 1
12 20H-21H 20:20 22:01 100 2
13 20H-21H 20:22 22:35 100 1
14 20H-21H 20:28 22:34 100 2
15 20H-21H 20:29 22:22 100 1
16 20H-21H 20:30 22:14 100 1
17 20H-21H 20:31 22:10 100 1
18 20H-21H 20:37 22:31 100 1
19 20H-21H 20:41 22:29 100 2
20 20H-21H 20:43 22:34 100 1
21 20H-21H 20:46 22:39 100 1
22 20H-21H 20:47 22:35 100 1
23 20H-21H 20:52 22:34 100 1
24 20H-21H 20:52 23:09 100 1
25 20H-21H 20:53 21:59 100 1
26 20H-21H 20:54 21:49 100 1
27 20H-21H 20:56 22:10 100 1
28 20H-21H 20:56 22:55 100 1
29 21H-22H 21:03 22:51 100 1
30 21H-22H 21:05 23:12 100 1
31 21H-22H 21:08 22:59 100 1
32 21H-22H 21:10 23:27 100 1
33 21H-22H 21:10 23:30 100 1
34 21H-22H 21:15 23:23 100 1
35 21H-22H 21:19 21:56 100 1
36 21H-22H 21:21 22:48 100 1
37 21H-22H 21:25 23:26 100 1
38 21H-22H 21:25 23:32 100 1
39 21H-22H 21:27 22:55 100 1
40 21H-22H 21:27 23:32 100 1
41 21H-22H 21:33 23:11 100 1
42 21H-22H 21:37 23:04 100 1
43 21H-22H 21:39 00:05 100 1
44 21H-22H 21:40 23:08 100 1
45 21H-22H 21:45 23:04 100 1
46 21H-22H 21:49 00:06 100 1
47 21H-22H 21:54 00:07 100 1
48 21H-22H 21:58 00:02 100 1
49 21H-22H 21:58 00:24 100 1
50 22H-23H 22:10 00:19 100 1
Expected output should be:
start_timeslot needed_power_kW
0 18H-19H 100
1 19H-20H 700
2 20H-21H 3100
3 21H-22H 4600
4 22H-23H 4600
5 23H-00H 1800
6 00H-01H 600
I am adding an example from the real dataset, because with the proposed solutions I cannot get the correct answer for all options.
mycolumns = ['start_timeslot', 'start', 'end', 'power_kW', 'vehicle_count']
data = [['00H-01H', '00:05', '00:09', 200.0, 1],
['00H-01H', '00:35', '00:39', 200.0, 1],
['01H-02H', '01:05', '01:09', 200.0, 1],
['05H-06H', '05:34', '05:41', 200.0, 1],
['05H-06H', '05:54', '06:01', 200.0, 1],
['06H-07H', '06:20', '06:27', 200.0, 1],
['06H-07H', '06:44', '06:47', 200.0, 1],
['06H-07H', '06:59', '07:06', 200.0, 1],
['07H-08H', '07:18', '07:22', 200.0, 1],
['07H-08H', '07:36', '07:40', 200.0, 1],
['07H-08H', '07:49', '07:56', 200.0, 1],
['08H-09H', '08:01', '08:05', 200.0, 1],
['08H-09H', '08:08', '08:14', 200.0, 1],
['08H-09H', '08:14', '08:20', 200.0, 1],
['08H-09H', '08:21', '08:26', 200.0, 1],
['08H-09H', '08:28', '08:35', 200.0, 1],
['08H-09H', '08:35', '08:42', 200.0, 1],
['08H-09H', '08:42', '08:46', 200.0, 1],
['08H-09H', '08:49', '08:56', 200.0, 1],
['08H-09H', '08:55', '09:02', 200.0, 1],
['08H-09H', '09:00', '09:04', 200.0, 1],
['09H-10H', '09:07', '09:13', 200.0, 1],
['09H-10H', '09:13', '09:20', 200.0, 1],
['09H-10H', '09:25', '09:29', 200.0, 1],
['09H-10H', '09:31', '09:36', 200.0, 1],
['09H-10H', '09:37', '09:42', 200.0, 1],
['09H-10H', '09:43', '09:47', 200.0, 1],
['09H-10H', '09:49', '09:56', 200.0, 1],
['09H-10H', '09:55', '10:02', 200.0, 1],
['10H-11H', '10:01', '10:05', 200.0, 1],
['10H-11H', '10:06', '10:13', 200.0, 1],
['10H-11H', '10:12', '10:16', 200.0, 1],
['10H-11H', '10:18', '10:23', 200.0, 1],
['10H-11H', '10:23', '10:27', 200.0, 1],
['10H-11H', '10:29', '10:33', 200.0, 1],
['10H-11H', '10:35', '10:39', 200.0, 1],
['10H-11H', '10:41', '10:45', 200.0, 1],
['10H-11H', '10:47', '10:52', 200.0, 1],
['10H-11H', '10:53', '10:57', 200.0, 1],
['10H-11H', '10:59', '11:03', 200.0, 1],
['11H-12H', '11:05', '11:09', 200.0, 1],
['11H-12H', '11:11', '11:15', 200.0, 1],
['11H-12H', '11:17', '11:21', 200.0, 1],
['11H-12H', '11:23', '11:27', 200.0, 1],
['11H-12H', '11:29', '11:33', 200.0, 1],
['11H-12H', '11:35', '11:39', 200.0, 1],
['11H-12H', '11:41', '11:46', 200.0, 1],
['11H-12H', '11:47', '11:51', 200.0, 1],
['11H-12H', '11:53', '11:57', 200.0, 1],
['11H-12H', '11:59', '12:03', 200.0, 1],
['12H-13H', '12:05', '12:09', 200.0, 1],
['12H-13H', '12:11', '12:15', 200.0, 1],
['12H-13H', '12:17', '12:21', 200.0, 1],
['12H-13H', '12:22', '12:26', 200.0, 1],
['12H-13H', '12:28', '12:33', 200.0, 1],
['12H-13H', '12:33', '12:37', 200.0, 1],
['12H-13H', '12:38', '12:43', 200.0, 1],
['12H-13H', '12:43', '12:47', 200.0, 1],
['12H-13H', '12:49', '12:53', 200.0, 1],
['12H-13H', '12:55', '12:59', 200.0, 1],
['13H-14H', '13:01', '13:05', 200.0, 1],
['13H-14H', '13:07', '13:12', 200.0, 1],
['13H-14H', '13:13', '13:17', 200.0, 1],
['13H-14H', '13:19', '13:23', 200.0, 1],
['13H-14H', '13:25', '13:29', 200.0, 1],
['13H-14H', '13:31', '13:35', 200.0, 1],
['13H-14H', '13:37', '13:41', 200.0, 1],
['13H-14H', '13:43', '13:47', 200.0, 1],
['13H-14H', '13:49', '13:53', 200.0, 1],
['13H-14H', '13:55', '13:59', 200.0, 1],
['14H-15H', '14:01', '14:06', 200.0, 1],
['14H-15H', '14:07', '14:11', 200.0, 1],
['14H-15H', '14:13', '14:17', 200.0, 1],
['14H-15H', '14:19', '14:23', 200.0, 1],
['14H-15H', '14:25', '14:29', 200.0, 1],
['14H-15H', '14:31', '14:35', 200.0, 1],
['14H-15H', '14:37', '14:41', 200.0, 1],
['14H-15H', '14:42', '14:46', 200.0, 1],
['14H-15H', '14:48', '14:52', 200.0, 1],
['14H-15H', '14:54', '14:58', 200.0, 1],
['14H-15H', '14:59', '15:03', 200.0, 1],
['15H-16H', '15:04', '15:08', 200.0, 1],
['15H-16H', '15:09', '15:13', 200.0, 1],
['15H-16H', '15:15', '15:19', 200.0, 1],
['15H-16H', '15:21', '15:25', 200.0, 1],
['15H-16H', '15:27', '15:31', 200.0, 1],
['15H-16H', '15:33', '15:37', 200.0, 1],
['15H-16H', '15:39', '15:43', 200.0, 1],
['15H-16H', '15:45', '15:49', 200.0, 1],
['15H-16H', '15:51', '15:55', 200.0, 1],
['15H-16H', '15:57', '16:01', 200.0, 1],
['16H-17H', '16:03', '16:07', 200.0, 1],
['16H-17H', '16:09', '16:13', 200.0, 1],
['16H-17H', '16:15', '16:19', 200.0, 1],
['16H-17H', '16:21', '16:27', 200.0, 1],
['16H-17H', '16:27', '16:31', 200.0, 1],
['16H-17H', '16:33', '16:37', 200.0, 1],
['16H-17H', '16:39', '16:43', 200.0, 1],
['16H-17H', '16:45', '16:49', 200.0, 1],
['16H-17H', '16:51', '16:55', 200.0, 1],
['16H-17H', '16:57', '17:01', 200.0, 1],
['17H-18H', '17:04', '17:08', 200.0, 1],
['17H-18H', '17:10', '17:14', 200.0, 1],
['17H-18H', '17:16', '17:20', 200.0, 1],
['17H-18H', '17:22', '17:26', 200.0, 1],
['17H-18H', '17:28', '17:32', 200.0, 1],
['17H-18H', '17:34', '17:38', 200.0, 1],
['17H-18H', '17:40', '17:44', 200.0, 1],
['17H-18H', '17:46', '17:50', 200.0, 1],
['17H-18H', '17:51', '17:55', 200.0, 1],
['17H-18H', '17:57', '18:01', 200.0, 1],
['18H-19H', '18:03', '18:07', 200.0, 1],
['18H-19H', '18:09', '18:13', 200.0, 1],
['18H-19H', '18:15', '18:19', 200.0, 1],
['18H-19H', '18:21', '18:25', 200.0, 1],
['18H-19H', '18:27', '18:31', 200.0, 1],
['18H-19H', '18:33', '18:37', 200.0, 1],
['18H-19H', '18:39', '18:43', 200.0, 1],
['18H-19H', '18:45', '18:49', 200.0, 1],
['18H-19H', '18:51', '18:55', 200.0, 1],
['18H-19H', '18:57', '19:01', 200.0, 1],
['19H-20H', '19:04', '19:08', 200.0, 1],
['19H-20H', '19:11', '19:15', 200.0, 1],
['19H-20H', '19:18', '19:22', 200.0, 1],
['19H-20H', '19:25', '19:29', 200.0, 1],
['19H-20H', '19:32', '19:36', 200.0, 1],
['19H-20H', '19:39', '19:43', 200.0, 1],
['19H-20H', '19:46', '19:50', 200.0, 1],
['19H-20H', '19:52', '19:56', 200.0, 1],
['20H-21H', '20:07', '20:11', 200.0, 1],
['20H-21H', '20:13', '20:17', 200.0, 1],
['20H-21H', '20:20', '20:24', 200.0, 1],
['20H-21H', '20:28', '20:32', 200.0, 1],
['20H-21H', '20:35', '20:39', 200.0, 1],
['20H-21H', '20:42', '20:46', 200.0, 1],
['20H-21H', '20:49', '20:53', 200.0, 1],
['20H-21H', '20:55', '20:59', 200.0, 1],
['21H-22H', '21:10', '21:14', 200.0, 1],
['21H-22H', '21:23', '21:24', 200.0, 1],
['21H-22H', '21:39', '21:43', 200.0, 1],
['22H-23H', '22:08', '22:12', 200.0, 1],
['22H-23H', '22:38', '22:42', 200.0, 1],
['23H-00H', '23:08', '23:12', 200.0, 1],
['23H-00H', '23:36', '23:40', 200.0, 1]]
df = pd.DataFrame(data, columns=mycolumns)
Pandas GroupBy – Count occurrences in column. Using the size () or count () method with pandas.DataFrame.groupby () will generate the count of a number of occurrences of data present in a particular column of the dataframe. However, this operation can also be performed using pandas.Series.value_counts () and, pandas.Index.value_counts ().
The following code shows how to count the number of occurrences of a specific string in a column of a pandas DataFrame: From the output we can see that the string ‘B’ occurs 4 times in the ‘team’ column. Note that we can also use the following syntax to find how frequently each unique value occurs in the ‘team’ column:
Note that if intervals are sorted by decreasing order of start times, we can quickly check if intervals overlap or not by comparing the start time of the previous interval with the end time of the current interval. Below is the implementation of the above algorithm.
Here, we separate count occurrences and combined count occurrences of the categorical columns present in a CSV file. The basic approach to use this method is to assign the column names as parameters in the groupby () method and then using the count () with it.
There are two main steps to this solution. First, you need to create a new column that you can group over. Simply using start_timeslot
won't work as rows in earlier timeslots that have not stopped charging can still affect later start_timeslot
groups.
New groups can be created using the start
and end
values together with date_range
. Note that extra care must be taken when the end value is in a new day.
df['start'] = pd.to_datetime(df['start'])
df['end'] = pd.to_datetime(df['end'])
df.loc[df['start'] > df['end'], 'end'] = df['end'] + pd.DateOffset(1)
df['hours'] = df.apply(lambda row: pd.date_range(row['start'].floor('H'), row['end'].floor('H'), freq='H').hour, axis=1)
Intermediate result:
start_timeslot start end power_kW vehicle_count hours
0 18H-19H 2021-11-18 18:00:00 2021-11-18 20:00:00 100 1 Int64Index([18, 19, 20], dtype='int64')
1 19H-20H 2021-11-18 19:00:00 2021-11-18 21:00:00 100 1 Int64Index([19, 20, 21], dtype='int64')
2 19H-20H 2021-11-18 19:00:00 2021-11-18 21:00:00 100 1 Int64Index([19, 20, 21], dtype='int64')
3 19H-20H 2021-11-18 19:00:00 2021-11-18 21:00:00 100 1 Int64Index([19, 20, 21], dtype='int64')
4 19H-20H 2021-11-18 19:00:00 2021-11-18 20:00:00 100 2 Int64Index([19, 20], dtype='int64')
5 19H-20H 2021-11-18 19:00:00 2021-11-18 22:00:00 100 1 Int64Index([19, 20, 21, 22], dtype='int64')
Now, we can explode
the new column and then group by it.
The second step is to compute the maximum required power for a group. We can simply take the start and end times and sort them. We consider the start time as positive (we add the required power) and end times as negative (subtract the power). Next, we sort by time and then use cumsum
on the series to get the cumulative sum and then take the maximum value of this. This will be the maximum required power for a group.
Code:
def func(df):
df['power'] = df['power_kW'] * df['vehicle_count']
pos = df.set_index('start')['power']
neg = df.set_index('end')['power'] * -1
df_merge = pd.concat([pos, neg]).reset_index()
maximum = df_merge.sort_values(['index', 'power'], ascending=[True, False])['power'].cumsum().max()
return maximum
res = df.explode('hours').groupby('hours').apply(func)
Note that if there is a car that stops charging at e.g. 10:35 and another one that starts at 10:35, this will be considered an overlap with the above code. If these should not overlap then change the sorting to use ascending=[True, True]
.
Result:
hours
0 600
18 100
19 700
20 3100
21 4700
22 4600
23 1800
dtype: int64
The start_timeslot
column is gone so some final postprocessing to obtain the expected result is required:
res = res.to_frame('needed_power_kW').reset_index().rename(columns={'hours': 'start_timeslot'})
res['end_hour'] = (pd.to_datetime(res['start_timeslot'], unit='h') + pd.DateOffset(hours=1)).dt.hour
res['start_timeslot'] = res['start_timeslot'].astype(str).str.zfill(2) + 'H-' + res['end_hour'].astype(str).str.zfill(2) + 'H'
res = res.drop('end_hour', axis=1)
Final result:
start_timeslot needed_power_kW
0 00H-01H 600
1 18H-19H 100
2 19H-20H 700
3 20H-21H 3100
4 21H-22H 4600
5 22H-23H 4600
6 23H-00H 1800
Using the additional data:
start_timeslot needed_power_kW
0 00H-01H 200.0
1 01H-02H 200.0
2 05H-06H 200.0
3 06H-07H 200.0
4 07H-08H 200.0
5 08H-09H 400.0
6 09H-10H 400.0
7 10H-11H 400.0
8 11H-12H 200.0
9 12H-13H 400.0
10 13H-14H 200.0
11 14H-15H 200.0
12 15H-16H 200.0
13 16H-17H 400.0
14 17H-18H 200.0
15 18H-19H 200.0
16 19H-20H 200.0
17 20H-21H 200.0
18 21H-22H 200.0
19 22H-23H 200.0
20 23H-00H 200.0
Note: by changing the sorting as described above, the needed_power_kW
for both 12H-13H
and 16H-17H
will be 200.
I would propose this approach.
First, change all the times to minutes per day. (If you have a larger dataset over days, you could adapt the code below using datetime
objects with the appropriate resolution, e.g. seconds.) That is done using the following function
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
def toMinutes(timeString):
t = [int(i) for i in timeString.split(':')]
t = 60 * t[0] + t[1]
return(t)
Next, I would iterate through your dataset noting the times where there were increases and decreases in usage along with the amount of power. You don't explicitly mention tracking the number of vehicles in your question but I am going to track that as well as power since they are pretty much the same from a counting point of view.
def add_row(timeString, power, nVehicle, arrive = True):
t = toMinutes(timeString)
delta = power * nVehicle
if arrive:
return([t / 60, delta, nVehicle])
else:
return([t / 60, -delta, - nVehicle])
Running through your dataframe
we can create a list of times where vehicles arrived or left and monitor the resulting change in power consumption. So each entry in your dataframe
results in two entries in the list below, one when the vehicle arrives and one when it leaves.
You stated in your question that a vehicle arriving at 20:50 and another leaving at 20:50 had to count as two vehicles at 20:50, so do all the arrivals first and then all the departures. That way when you sort below, it will count the arrivals before the departures.
x = [[0, 0, 0],] # initialize list (mostly for plotting, you can skip this if not plotting)
x += [add_row(t, P, n) \
for t, P, n in zip(df['start'], df['power_kW'], df['vehicle_count'])]
x += [add_row(t, P, n, arrive = False) \
for t, P, n in zip(df['end'], df['power_kW'], df['vehicle_count'])]
Now sort x
by the time column so we have an ordered time series of discrete events that we care about. The power consumption and number of vehicles is just the sum of past events in our dataset.
x = np.array(sorted(x))
t = x[:,0]
P = np.cumsum(x[:,1])
N = np.cumsum(x[:,2])
I like plotting things so here is a script to make sure the code is doing what we want so far.
def plot_events(t, P, n):
Nevent = len(P)
to = []
Po = []
No = []
for i in range(1, Nevent):
to.append(t[i])
to.append(t[i])
Po.append(P[i-1])
Po.append(P[i])
No.append(N[i-1])
No.append(N[i])
fig, ax = plt.subplots(figsize = (6,8), nrows=2)
ax[0].plot(to, Po)
ax[0].set_xlabel('Time in Hours')
ax[0].set_ylabel('Power in kW')
ax[1].plot(to, No)
ax[1].set_xlabel('Time in Hours')
ax[1].set_ylabel('Number of vehicles charging')
return(fig, ax)
fig, ax = plot_events(t, P, N)
Now to bin the events in time, we just look at the maximum demand in each time window of interest.
def bin_events(t, y, dt = 1, mint = 0, maxt = 24):
# find max value of y in t bins of size dt
x = np.arange(mint, maxt, dt)
z = np.zeros(x.shape)
for i in range(len(x)):
try:
z[i] = np.max(y[(t >= x[i]) & (t < x[i] + dt)])
except ValueError:
pass
return(z)
maxP = bin_events(t, P)
maxN = bin_events(t, N)
And again, plot to make sure we're doing what we want:
ax[0].bar(np.arange(0,24) + 0.5, maxP, color = 'k', alpha = 0.3, width = 1)
ax[1].bar(np.arange(0,24) + 0.5, maxN, color = 'k', alpha = 0.3, width = 1)
Finally, put the new info to a dataframe
or add it to your existing one:
col = 'start_timeslot'
choices = [f"{i:02}H-{(i+1)%24:02}H" for i in range(24)]
conditions = [eval(f"df[col] == '{c}'") for c in choices]
summary_data_df = pd.DataFrame(np.array([choices, maxP, maxN]).transpose(),
columns=['start_timeslot', 'max_power_kW', 'max_vehicles'])
df["max_power_kW"] = np.select(conditions, maxP, default=0)
df["max_vehicles"] = np.select(conditions, maxN, default=0)
start_timeslot max_power_kW max_vehicles
0 00H-01H 200.0 1.0
1 01H-02H 200.0 1.0
2 02H-03H 0.0 0.0
3 03H-04H 0.0 0.0
4 04H-05H 0.0 0.0
5 05H-06H 200.0 1.0
6 06H-07H 200.0 1.0
7 07H-08H 200.0 1.0
8 08H-09H 400.0 2.0
9 09H-10H 400.0 2.0
10 10H-11H 400.0 2.0
11 11H-12H 200.0 1.0
12 12H-13H 200.0 1.0
13 13H-14H 200.0 1.0
14 14H-15H 200.0 1.0
15 15H-16H 200.0 1.0
16 16H-17H 200.0 1.0
17 17H-18H 200.0 1.0
18 18H-19H 200.0 1.0
19 19H-20H 200.0 1.0
20 20H-21H 200.0 1.0
21 21H-22H 200.0 1.0
22 22H-23H 200.0 1.0
23 23H-00H 200.0 1.0
I have found an answer to your question. Code is below:
import pandas as pd
mycolumns = ['start_timeslot', 'start', 'end', 'power_kW', 'vehicle_count'] #Column header
data = [['18H-19H', '18:26', '20:27', 100, 1], ['19H-20H', '19:06', '21:16', 100, 1], #Data
['19H-20H', '19:19', '21:16', 100, 1], ['19H-20H', '19:36', '21:33', 100, 1],
['19H-20H', '19:54', '20:19', 100, 2], ['19H-20H', '19:55', '22:01', 100, 1],
['20H-21H', '20:04', '22:06', 100, 1], ['20H-21H', '20:11', '22:04', 100, 2],
['20H-21H', '20:15', '22:04', 100, 1], ['20H-21H', '20:16', '22:04', 100, 1],
['20H-21H', '20:17', '22:08', 100, 1], ['20H-21H', '20:19', '22:09', 100, 1],
['20H-21H', '20:20', '22:01', 100, 2], ['20H-21H', '20:22', '22:35', 100, 1],
['20H-21H', '20:28', '22:34', 100, 2], ['20H-21H', '20:29', '22:22', 100, 1],
['20H-21H', '20:30', '22:14', 100, 1], ['20H-21H', '20:31', '22:10', 100, 1],
['20H-21H', '20:37', '22:31', 100, 1], ['20H-21H', '20:41', '22:29', 100, 2],
['20H-21H', '20:43', '22:34', 100, 1], ['20H-21H', '20:46', '22:39', 100, 1],
['20H-21H', '20:47', '22:35', 100, 1], ['20H-21H', '20:52', '22:34', 100, 1],
['20H-21H', '20:52', '23:09', 100, 1], ['20H-21H', '20:53', '21:59', 100, 1],
['20H-21H', '20:54', '21:49', 100, 1], ['20H-21H', '20:56', '22:10', 100, 1],
['20H-21H', '20:56', '22:55', 100, 1], ['21H-22H', '21:03', '22:51', 100, 1],
['21H-22H', '21:05', '23:12', 100, 1], ['21H-22H', '21:08', '22:59', 100, 1],
['21H-22H', '21:10', '23:27', 100, 1], ['21H-22H', '21:10', '23:30', 100, 1],
['21H-22H', '21:15', '23:23', 100, 1], ['21H-22H', '21:19', '21:56', 100, 1],
['21H-22H', '21:21', '22:48', 100, 1], ['21H-22H', '21:25', '23:26', 100, 1],
['21H-22H', '21:25', '23:32', 100, 1], ['21H-22H', '21:27', '22:55', 100, 1],
['21H-22H', '21:27', '23:32', 100, 1], ['21H-22H', '21:33', '23:11', 100, 1],
['21H-22H', '21:37', '23:04', 100, 1], ['21H-22H', '21:39', '00:05', 100, 1],
['21H-22H', '21:40', '23:08', 100, 1], ['21H-22H', '21:45', '23:04', 100, 1],
['21H-22H', '21:49', '00:06', 100, 1], ['21H-22H', '21:54', '00:07', 100, 1],
['21H-22H', '21:58', '00:02', 100, 1], ['21H-22H', '21:58', '00:24', 100, 1],
['22H-23H', '22:10', '00:19', 100, 1]]
per_hour_columns = ['hour', 'power_kW'] #Header for individual vehicles
per_hour = [] #Empty data
x = 0
while(x < len(data)): #Loop through all data
y = int(data[x][1].split(':')[0])
while(y != int(data[x][2].split(':')[0])+1): #Loop though all hours the vehicle stayed
if(y >= 24):
y = 0
i = 0
while(i < data[x][4]): #Loop through number of vehicles
per_hour.append([str(y), data[x][3]]) #Append vehicle to array
i+=1
y+=1
x+=1
per_hour_df = pd.DataFrame(per_hour, columns=per_hour_columns) #Turn to dataframe
total_hours_header = ['time_slot', 'total_power_kW'] #header for total power per time slot
total_hours = []
i = 0
while(i < 24): #Loop through all hours
total = int(per_hour_df[per_hour_df['hour'] == str(i)].sum()[1]) #Get total for timeslot
total_hours.append([str(i) + "H-" + str(i+1) + "H", total]) #Append time slot to array
i+=1
total_hours_df = pd.DataFrame(total_hours, columns=total_hours_header) #Turn to dataframe
print(total_hours_df) #Print
I hope this works for you.
Output:
time_slot total_power_kW
0 0H-1H 600
1 1H-2H 0
2 2H-3H 0
3 3H-4H 0
4 4H-5H 0
5 5H-6H 0
6 6H-7H 0
7 7H-8H 0
8 8H-9H 0
9 9H-10H 0
10 10H-11H 0
11 11H-12H 0
12 12H-13H 0
13 13H-14H 0
14 14H-15H 0
15 15H-16H 0
16 16H-17H 0
17 17H-18H 0
18 18H-19H 100
19 19H-20H 700
20 20H-21H 3400
21 21H-22H 5200
22 22H-23H 4700
23 23H-24H 1800
To explain: I first create a new array which has one slot per vehicle (instead of having a vehicle count greater than 1). I also duplicate each vehicle for every hour is stayed. Next I looped through all hours (0-24) and appended the sum of all the elements with the current hour I am trying to find to a new array. This gives you an array which contains all 24 time slots with their corresponding power usage.
EDIT: I have changed the code slightly so that it records times that pass from hour 23 to 00
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With