I have the following Pandas dataframe:
data = {
'ID': [100, 100, 100, 100, 200, 200, 200, 200, 200, 300, 300, 300, 300, 300],
'value': [False, False, True, False, False, True, True, True, False, False, False, True, True, False],
}
df = pandas.DataFrame (data, columns = ['ID','value'])
I want to get the following groups:
Can this be done with pandas?
I've tried
group = df.groupby((df['value'].shift() != df['value']).cumsum())
but this returns an incorrect result.
To count unique values per groups in Python Pandas, we can use df. groupby('column_name'). count().
groupby() function is used to split the data into groups based on some criteria. pandas objects can be split on any of their axes. The abstract definition of grouping is to provide a mapping of labels to group names.
And you can use the following syntax to select unique rows across specific columns in a pandas DataFrame: df = df. drop_duplicates(subset=['col1', 'col2', ...])
Let us try shift
+ cumsum
create the groupby key: BTW I really like the way you display your expected output
s = df.groupby('ID')['value'].apply(lambda x : x.ne(x.shift()).cumsum())
d = {x : y for x ,y in df.groupby(s)}
d[2]
ID value
2 100 True
5 200 True
6 200 True
7 200 True
11 300 True
12 300 True
d[1]
ID value
0 100 False
1 100 False
4 200 False
9 300 False
10 300 False
d[3]
ID value
3 100 False
8 200 False
13 300 False
Let's try following your logic:
# 1. all False up to first True
group1 = df.loc[df.groupby('ID')['value'].cumsum() == 0]
# 2. all False after last True
group2 = df.loc[df.iloc[::-1].groupby('ID')['value'].cumsum()==0]
# 3. all True
group3 = df[df['value']]
Output:
ID value
0 100 False
1 100 False
4 200 False
9 300 False
10 300 False
ID value
3 100 False
8 200 False
13 300 False
ID value
2 100 True
5 200 True
6 200 True
7 200 True
11 300 True
12 300 True
This works for your example data
df['groups'] = df.groupby('ID').value.apply(lambda x: x.diff().ne(False).cumsum()).astype('int')
for _,df_groups in df.groupby('groups'):
print(df_groups)
print('-'*20)
Out:
ID value groups
0 100 False 1
1 100 False 1
4 200 False 1
9 300 False 1
10 300 False 1
--------------------
ID value groups
2 100 True 2
5 200 True 2
6 200 True 2
7 200 True 2
11 300 True 2
12 300 True 2
--------------------
ID value groups
3 100 False 3
8 200 False 3
13 300 False 3
--------------------
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With