Python dataframe group labeling

Question

Is there a tool in pandas, or in another Python package, that makes it easier to label groups with a varying number of rows in a pandas DataFrame? Here are more details:

My DataFrame is indexed by a column of datetime.datetime objects. The timestamps are not evenly spaced, so the number of rows within a given amount of time varies.

In that DataFrame, I need to create a column that holds a different label for each run of rows spanning a constant time window (12 seconds in my case).

For example, with times in seconds:

Time(s) | label
1       | 0
2       | 0
3       | 0
7       | 0
12      | 1
15      | 1
20      | 1
24      | 2
Etc.
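
The labels in the table above are consistent with integer-dividing each time by the 12-second window length. A minimal NumPy sketch of that arithmetic, assuming the times are plain seconds measured from zero (the names `times`, `window`, and `labels` are only illustrative):

```python
import numpy as np

# Times in seconds, copied from the example table (assumed to start from 0).
times = np.array([1, 2, 3, 7, 12, 15, 20, 24])

# Window length in seconds.
window = 12

# Integer division maps [0, 12) to 0, [12, 24) to 1, [24, 36) to 2, and so on.
labels = times // window

print(labels)  # [0 0 0 0 1 1 1 2]
```

This is one vectorized operation over the whole array, so no explicit Python loop over rows is involved.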

I'm looking for an alternative to a brute-force loop. Related terms seem to be labeling, segmenting, and mapping, but I'm not sure exactly what to search for. I was hoping for something that uses Python, pandas, and/or NumPy more efficiently, since the number of rows will be on the order of a hundred million.

Thanks

MaxU - stop WAR against UA · Accepted Answer

Demo:

Sample DF:

In [31]: import pandas as pd

In [32]: df = pd.DataFrame({'Time': pd.date_range('2017-01-01 00:00:01', freq='3s', periods=15)})

In [33]: df
Out[33]:
                  Time
0  2017-01-01 00:00:01
1  2017-01-01 00:00:04
2  2017-01-01 00:00:07
3  2017-01-01 00:00:10
4  2017-01-01 00:00:13
5  2017-01-01 00:00:16
6  2017-01-01 00:00:19
7  2017-01-01 00:00:22
8  2017-01-01 00:00:25
9  2017-01-01 00:00:28
10 2017-01-01 00:00:31
11 2017-01-01 00:00:34
12 2017-01-01 00:00:37
13 2017-01-01 00:00:40
14 2017-01-01 00:00:43

Solution:

In [34]: df['label'] = df.groupby(pd.Grouper(key='Time', freq='12s')).ngroup()

Result:

In [35]: df
Out[35]:
                  Time  label
0  2017-01-01 00:00:01      0
1  2017-01-01 00:00:04      0
2  2017-01-01 00:00:07      0
3  2017-01-01 00:00:10      0
4  2017-01-01 00:00:13      1
5  2017-01-01 00:00:16      1
6  2017-01-01 00:00:19      1
7  2017-01-01 00:00:22      1
8  2017-01-01 00:00:25      2
9  2017-01-01 00:00:28      2
10 2017-01-01 00:00:31      2
11 2017-01-01 00:00:34      2
12 2017-01-01 00:00:37      3
13 2017-01-01 00:00:40      3
14 2017-01-01 00:00:43      3
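
The question describes the timestamps as the DataFrame index rather than a column. The following is not the method from the answer above but an alternative sketch for that case, under these assumptions: the index is a DatetimeIndex, windows are aligned to multiples of 12 seconds, and the sample timestamps and the `value` column are made up for illustration. It floors each timestamp to the start of its window and then numbers the windows relative to the first one.

```python
import numpy as np
import pandas as pd

# Irregularly spaced timestamps used as the index (illustrative data only).
idx = pd.to_datetime([
    "2017-01-01 00:00:01",
    "2017-01-01 00:00:02",
    "2017-01-01 00:00:07",
    "2017-01-01 00:00:12",
    "2017-01-01 00:00:20",
    "2017-01-01 00:00:24",
])
df = pd.DataFrame({"value": np.arange(len(idx))}, index=idx)

# Round each timestamp down to the start of its 12-second window.
window_start = df.index.floor("12s")

# Number each window by its distance from the first window.
# Windows that contain no rows still consume a number under this scheme.
offsets = (window_start - window_start[0]) // pd.Timedelta("12s")
df["label"] = np.asarray(offsets)

print(df["label"].tolist())  # [0, 0, 0, 1, 1, 2]
```

Both steps are vectorized, so the cost should grow roughly linearly with the number of rows, though it has not been measured here at the scale mentioned in the question.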

Tags: python, pandas, labeling, segment
