I want to create a cumulative counter of the number of times each value appears. e.g. say I have the column:

id
1
2
3
2
2
1
2
3

This would become:

id count
 1     1
 2     1
 3     1
 2     2
 2     3
 1     2
 2     4
 3     2

etc...

Cumulative count of each value [duplicate]

Tags: r, count, cumulative-sum, running-count

asked Apr 05 '12 13:04

user1165199

4 Answers

The ave function applies a function to each group of values and returns the results in the original order.

> id <- c(1,2,3,2,2,1,2,3)
> data.frame(id,count=ave(id==id, id, FUN=cumsum))
  id count
1  1     1
2  2     1
3  3     1
4  2     2
5  2     3
6  1     2
7  2     4
8  3     2

I use id==id to create a vector of all TRUE values (this assumes id contains no NAs), which are coerced to numeric when passed to cumsum. You could replace id==id with rep(1,length(id)).
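A variation on the same idea, as a minimal sketch: since only the position within each group matters, seq_along can stand in for both the vector of ones and cumsum. This assumes id has no NA values; the names count_cumsum and count_seq are just for illustration.

```r
id <- c(1, 2, 3, 2, 2, 1, 2, 3)

# Same idea as above: cumulative sum of ones within each group of id
count_cumsum <- ave(rep(1, length(id)), id, FUN = cumsum)

# Variation: number the elements of each group directly
count_seq <- ave(seq_along(id), id, FUN = seq_along)

print(data.frame(id, count = count_seq))
```

Both calls should give the same counts; the second one returns integers rather than doubles.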


answered Oct 18 '22 13:10

Joshua Ulrich

Here is a way to get the counts:

id <- c(1,2,3,2,2,1,2,3)

# seq_along(id) avoids the 1:0 trap when id is empty
sapply(seq_along(id), function(i) sum(id[i] == id[1:i]))

Which gives you:

[1] 1 1 1 2 3 2 4 2
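Note that the sapply approach rescans the prefix of id for every element, so the work grows quadratically with the length of the vector. For long vectors, a single pass with one counter per distinct value is an alternative. A rough sketch; running_count is a hypothetical helper name, not a base R function:

```r
# Hypothetical helper: running count of each value in a single pass
running_count <- function(x) {
  u <- unique(x)
  grp <- match(x, u)              # index of each element's distinct value
  counts <- integer(length(u))    # one counter per distinct value, all zero
  out <- integer(length(x))
  for (i in seq_along(x)) {
    g <- grp[i]
    counts[g] <- counts[g] + 1L   # bump the counter for this value
    out[i] <- counts[g]           # record the count seen so far
  }
  out
}

id <- c(1, 2, 3, 2, 2, 1, 2, 3)
print(running_count(id))
```

On the example vector this should match the sapply result.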

answered Oct 18 '22 13:10

Sacha Epskamp

The dplyr way:

library(dplyr)

foo <- data.frame(id=c(1, 2, 3, 2, 2, 1, 2, 3))
foo <- foo %>% group_by(id) %>% mutate(count=row_number())
foo

# A tibble: 8 x 2
# Groups:   id [3]
     id count
  <dbl> <int>
1     1     1
2     2     1
3     3     1
4     2     2
5     2     3
6     1     2
7     2     4
8     3     2

The result is still grouped by id. If you don't want it grouped, add %>% ungroup().
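For reference, the ungrouped version might look like the sketch below. The name counted is just an illustrative choice, and is_grouped_df is used only to check that the grouping is gone.

```r
library(dplyr)

foo <- data.frame(id = c(1, 2, 3, 2, 2, 1, 2, 3))

counted <- foo %>%
  group_by(id) %>%
  mutate(count = row_number()) %>%  # position of each row within its id group
  ungroup()                         # drop the grouping so later verbs see the whole table

print(is_grouped_df(counted))
```

Dropping the grouping matters mainly when more dplyr verbs follow, since they would otherwise keep operating per id.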

answered Oct 18 '22 14:10

dfrankow

For completeness, adding a data.table way:

library(data.table)

DT <- data.table(id = c(1, 2, 3, 2, 2, 1, 2, 3))

DT[, count := seq(.N), by = id][]

Output:

   id count
1:  1     1
2:  2     1
3:  3     1
4:  2     2
5:  2     3
6:  1     2
7:  2     4
8:  3     2

answered Oct 18 '22 13:10

Jens Adamczak
