
Increase counter by 1 for each unique group of values

Tags:

r

I want to create a continually increasing counter for each group, where each group is a unique combination of person and day.

This is what the data looks like:

> df
  person      date
1      0    monday
2      0   tuesday
3      1    monday
4      1    monday
5      1   tuesday
6      2    monday
7      2    monday
8      2   tuesday
9      2 wednesday

Thus, I want to add a new variable that starts at 1 and increases by 1 for each new combination of person and day.

> df
  person      date counter
1      0    monday       1
2      0   tuesday       2
3      1    monday       3
4      1    monday       3
5      1   tuesday       4
6      2    monday       5
7      2    monday       5
8      2   tuesday       6
9      2 wednesday       7

I hope that the data is clear enough. The counter continues until it reaches the end of the data set.

Boudewijn Aasman asked Jul 22 '15 20:07



2 Answers

You can use rleid from the devel version of data.table (v1.9.5+). Instructions for installing the devel version are in the data.table installation documentation.

 library(data.table)  # v1.9.5+
 # rleid() starts a new id whenever person or date differs from the previous row
 setDT(df)[, counter := rleid(person, date)][]
 #    person      date counter
 # 1:      0    monday       1
 # 2:      0   tuesday       2
 # 3:      1    monday       3
 # 4:      1    monday       3
 # 5:      1   tuesday       4
 # 6:      2    monday       5
 # 7:      2    monday       5
 # 8:      2   tuesday       6
 # 9:      2 wednesday       7

Or

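library(dplyr)
# Flag the first row and every row where person or date changes, then accumulate
df %>%
  mutate(counter = cumsum(row_number() == 1 |
                          person != lag(person) |
                          date != lag(date)))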
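
The snippets in this answer number runs of consecutive identical rows. A minimal base R sketch of that run-length idea, assuming the data frame from the question (rebuilt here so the example runs on its own; `key` and `run_id` are illustrative names, not part of the original answer):

```r
# Sample data from the question
df <- data.frame(
  person = c(0, 0, 1, 1, 1, 2, 2, 2, 2),
  date   = c("monday", "tuesday", "monday", "monday", "tuesday",
             "monday", "monday", "tuesday", "wednesday"),
  stringsAsFactors = FALSE
)

# One key per row combining both grouping columns
key <- paste(df$person, df$date, sep = "_")

# A new run starts on the first row and wherever the key differs
# from the key on the previous row
run_id <- cumsum(c(TRUE, key[-1] != key[-length(key)]))

df$counter <- run_id
df$counter
# [1] 1 2 3 3 4 5 5 6 7
```

Run-based counters like this assume that rows belonging to the same person and day are adjacent. If the same combination reappears later in unsorted data, it gets a new number.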
akrun answered Nov 02 '22 01:11


Base package:

# Number each distinct (person, date) row, then join the numbers back onto df
df1 <- unique(df)
df1$counter <- seq_len(nrow(df1))
merge(df, df1)

Output:

  person      date counter
1      0    monday       1
2      0   tuesday       2
3      1    monday       3
4      1    monday       3
5      1   tuesday       4
6      2    monday       5
7      2    monday       5
8      2   tuesday       6
9      2 wednesday       7
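
Note that merge() sorts its result on the join columns by default, so the rows come back in sorted order; that matches the original order here only because the sample data is already sorted. A possible base R variant that keeps the rows where they are, sketched with `match()` (the names `key`, `df2`, and `key2` are illustrative, not from the original answer):

```r
# Sample data from the question
df <- data.frame(
  person = c(0, 0, 1, 1, 1, 2, 2, 2, 2),
  date   = c("monday", "tuesday", "monday", "monday", "tuesday",
             "monday", "monday", "tuesday", "wednesday"),
  stringsAsFactors = FALSE
)

# Each row's counter is the position of its key among the distinct keys,
# taken in order of first appearance
key <- paste(df$person, df$date, sep = "_")
df$counter <- match(key, unique(key))
df$counter
# [1] 1 2 3 3 4 5 5 6 7

# On unsorted rows the order is preserved and a repeated
# combination reuses its earlier number
df2 <- df[c(3, 1, 4, 2), c("person", "date")]
key2 <- paste(df2$person, df2$date, sep = "_")
match(key2, unique(key2))
# [1] 1 2 1 3
```

Unlike the run-length approaches, this numbers distinct combinations, so it does not depend on the data being sorted.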
mpalanco answered Nov 01 '22 23:11