Add an index (or counter) to a dataframe by group in R [duplicate]

I have a data frame like this:

ProjectID Dist
  1        x
  1        y
  2        z
  2        x
  2        h
  1        k
  ....     ....

I want to add a third column such that we have an incrementing counter for each ProjectID:

ProjectID Dist counter
  1        x     1
  1        y     2
  2        z     1
  2        x     2
  2        h     3
  1        k     3
  ....     ....

I've had a look at seq, rank, and a couple of other functions, particularly to see whether I could use ddply to help:

df$counter <- ddply(df, .(ProjectID), function(x) .....? )

I think I could adapt the answer to "How to create a counter/numeration by group?", but I would prefer something based on ddply. I can't find an equivalent of cumsum, but I think the principle is the same as in "Create ascending series of integers by group in Pandas". That would let me index occurrences in a list (and, e.g., merge on this).

sjgknight asked Feb 21 '15 16:02


People also ask

How do you add sequential numbers in R?

The simplest way to create a sequence of numbers in R is the : operator. Typing 1:20 returns every integer from 1 to 20, inclusive (an integer is a positive or negative whole number, or 0).
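As a small illustration of the point above, the sketch below builds a few sequences in base R. The variable names are made up for the example.

```r
# The : operator gives consecutive integers, endpoints included
one_to_twenty <- 1:20

# seq() allows a step other than 1
every_fifth <- seq(1, 20, by = 5)   # 1 6 11 16

# seq_len(n) is a safer way to write 1:n, because it returns
# an empty vector when n is 0
first_five <- seq_len(5)

print(one_to_twenty)
print(every_fifth)
print(first_five)
```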

How do I add row numbers to a Dataframe in R?

Adding row numbers using base R: first, create a variable containing the row numbers, using the seq() function to build a vector with one entry per row of the data frame. Then add the vector to the data frame with the $ operator.

How do I generate row numbers in R?

To generate row numbers for a data frame in R, use the seq.int() function together with nrow(). The row_number() function from dplyr can also generate a row index.
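A minimal sketch of these base R approaches follows. The data frame is hypothetical example data modeled on the question's expected output, and the column names row and row_alt are invented for illustration. Note that this numbers all rows of the data frame, not rows within each group.

```r
# Hypothetical example data, mirroring the question's columns
df <- data.frame(
  ProjectID = c(1, 1, 2, 2, 2, 1),
  Dist = c("x", "y", "z", "x", "h", "k")
)

# One row number per row, regardless of group
df$row <- seq_len(nrow(df))

# seq.int() with nrow() gives the same result for a non-empty data frame
df$row_alt <- seq.int(nrow(df))

print(df)
```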


1 Answer

A dplyr solution is quite simple:

library(dplyr)

# Number the rows within each ProjectID group, in their existing order
df %>% group_by(ProjectID) %>% mutate(counter = row_number())


#  ProjectID Dist counter
#1         1    x       1
#2         1    y       2
#3         2    z       1
#4         2    x       2
#5         2    h       3
#6         1    k       3
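For readers without dplyr, a base R sketch of the same idea might look like the following. It uses ave() with seq_along(). The data frame is reconstructed here from the printed output above, so treat it as an assumption about the original data.

```r
# Example data reconstructed from the output shown above (an assumption)
df <- data.frame(
  ProjectID = c(1, 1, 2, 2, 2, 1),
  Dist = c("x", "y", "z", "x", "h", "k")
)

# ave() applies FUN within each ProjectID group and puts the results
# back in the original row order; seq_along() counts 1, 2, 3, ... per group
df$counter <- ave(seq_along(df$ProjectID), df$ProjectID, FUN = seq_along)

print(df)
```

Because the counter follows the existing row order within each group, sort the data frame first if a particular ordering matters.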
jalapic answered Sep 29 '22 16:09