How to count how many values per level in a given factor?

Tags:

r

count

frequency

I have a data.frame mydf with about 2500 rows. These rows correspond to 69 classes of objects in column 1, mydf$V1, and I want to count how many rows I have per object class. I can get a factor of these classes with:

objectclasses = unique(factor(mydf$V1, exclude="1"));

What's the terse R way to count the rows per object class? If this were any other language I'd be traversing an array with a loop and keeping count but I'm new to R programming and am trying to take advantage of R's vectorised operations.


asked Sep 30 '14 06:09

Escher

8 Answers

Or using the dplyr library:

library(dplyr)
set.seed(1)
dat <- data.frame(ID = sample(letters, 100, replace = TRUE))
dat %>% 
  group_by(ID) %>%
  summarise(no_rows = length(ID))

Note the use of %>%, which is similar to the use of pipes in bash. Effectively, the code above pipes dat into group_by, and the result of that operation is piped into summarise.

The result is:

Source: local data frame [26 x 2]

   ID no_rows
1   a       2
2   b       3
3   c       3
4   d       3
5   e       2
6   f       4
7   g       6
8   h       1
9   i       6
10  j       5
11  k       6
12  l       4
13  m       7
14  n       2
15  o       2
16  p       2
17  q       5
18  r       4
19  s       5
20  t       3
21  u       8
22  v       4
23  w       5
24  x       4
25  y       3
26  z       1

See the dplyr introduction for some more context, and the documentation for details regarding the individual functions.
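
To make the piping concrete, here is a small sketch on a hypothetical six-row data frame (toy is not part of the original data). It suggests that the piped form and the equivalent nested call give the same counts:

```r
library(dplyr)

# Hypothetical toy data, small enough to check the counts by eye
toy <- data.frame(ID = c("a", "b", "a", "c", "a", "b"))

# Piped form: toy flows into group_by(), whose result flows into summarise()
piped <- toy %>%
  group_by(ID) %>%
  summarise(no_rows = length(ID))

# Nested form of the same computation: the innermost call runs first
nested <- summarise(group_by(toy, ID), no_rows = length(ID))

# Both give one row per ID with the same counts
all.equal(as.data.frame(piped), as.data.frame(nested))
```

The nested form reads inside out, which is the main reason the piped form tends to be easier to follow once more steps are chained.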

answered Oct 04 '22 07:10

Paul Hiemstra

Here are two ways to do it:

set.seed(1)
tt <- sample(letters, 100, replace = TRUE)

## using table
table(tt)
# tt
# a b c d e f g h i j k l m n o p q r s t u v w x y z 
# 2 3 3 3 2 4 6 1 6 5 6 4 7 2 2 2 5 4 5 3 8 4 5 4 3 1 

## using tapply
tapply(tt, tt, length)
# a b c d e f g h i j k l m n o p q r s t u v w x y z 
# 2 3 3 3 2 4 6 1 6 5 6 4 7 2 2 2 5 4 5 3 8 4 5 4 3 1
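
If the counts are needed for further processing rather than just printing, a table can be indexed by name, sorted, or converted to a data frame. A minimal sketch, using a hypothetical six-element vector instead of the random sample above:

```r
# Hypothetical toy vector, small enough to check by eye
tt <- c("a", "b", "a", "c", "a", "b")

counts <- table(tt)

# Look up the count for a single class by name
counts["a"]

# Sort classes from most to least frequent
sort(counts, decreasing = TRUE)

# One row per class; the count column is named Freq by default
counts_df <- as.data.frame(counts, stringsAsFactors = FALSE)
counts_df
```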

answered Oct 04 '22 05:10

agstudy

Using the plyr package:

library(plyr)

count(mydf$V1)

It returns a data frame with the frequency of each value.
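
As a rough sketch of what that result looks like: when given a plain vector, plyr's count returns a data frame with a column x holding the values and a column freq holding the counts. The vector v below is hypothetical. The function is called as plyr::count so that plyr does not need to be attached, which avoids masking dplyr functions of the same name if both packages are in use:

```r
# Hypothetical toy vector standing in for mydf$V1
v <- c("a", "b", "a", "c", "a", "b")

# Call count() through the namespace instead of attaching plyr
freq <- plyr::count(v)
freq

# Order the classes from most to least frequent
freq[order(-freq$freq), ]
```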

answered Oct 04 '22 06:10

Andriy T.

Using data.table:

library(data.table)
setDT(dat)[, .N, keyby=ID] #(Using @Paul Hiemstra's `dat`)

Or using dplyr 0.3:

res <- count(dat, ID)
head(res)
#Source: local data frame [6 x 2]

#  ID n
#1  a 2
#2  b 3
#3  c 3
#4  d 3
#5  e 2
#6  f 4

Or

dat %>% 
  group_by(ID) %>% 
  tally()

Or

dat %>% 
  group_by(ID) %>%
  summarise(n=n())
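
For the data.table variant, the choice between by and keyby affects the order of the result. A minimal sketch, assuming a hypothetical table dt whose IDs are deliberately out of alphabetical order:

```r
library(data.table)

# Hypothetical toy data with IDs deliberately out of alphabetical order
dt <- data.table(ID = c("c", "a", "b", "a", "c", "a"))

# by: groups appear in order of first occurrence (c, a, b)
dt[, .N, by = ID]

# keyby: groups are sorted by ID (a, b, c) and the result is keyed on ID
dt[, .N, keyby = ID]
```

Using keyby, as in the answer above, is convenient when the counts should come back sorted by class.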

answered Oct 04 '22 05:10

akrun

We can use summary on a factor column:

summary(myDF$factorColumn)
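
One caveat worth sketching: summary only returns per-level counts when the column is actually a factor. For a character column it reports length, class, and mode instead, so the column may need converting first. The data frame below is a hypothetical stand-in for myDF:

```r
# Hypothetical stand-in for myDF, with the column stored as character
myDF <- data.frame(factorColumn = c("a", "b", "a", "c", "a", "b"),
                   stringsAsFactors = FALSE)

# Character column: prints Length / Class / Mode, not counts
summary(myDF$factorColumn)

# Converting to a factor first gives a named vector of counts per level
counts <- summary(factor(myDF$factorColumn))
counts
```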

answered Oct 04 '22 06:10

Spariant

One more approach is to use dplyr's n() function, which counts the number of observations in each group:

library(dplyr)
library(magrittr)
data %>% 
  group_by(columnName) %>%
  summarise(Count = n())

answered Oct 04 '22 06:10

iamigham

In case I just want to know how many unique factor levels exist in the data, I use:

length(unique(df$factorcolumn))
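
Note that this counts the values that actually occur, which can differ from the number of declared levels when a factor has unused levels. A small sketch with a hypothetical factor f:

```r
# Hypothetical factor with a declared level "c" that never occurs
f <- factor(c("a", "b", "a"), levels = c("a", "b", "c"))

length(unique(f))       # 2: distinct values actually present
nlevels(f)              # 3: declared levels, including the unused "c"
table(f)                # reports a count of 0 for "c"
nlevels(droplevels(f))  # 2: unused levels removed first
```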

answered Oct 04 '22 06:10

Peter

Use the package plyr with lapply to get frequencies for every value (level) and every variable (factor) in your data frame.

library(plyr)
lapply(df, count)
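
If plyr is not available, a similar per-column summary can be sketched in base R by swapping count for table. This is a substitute technique, not what the answer above uses, and the data frame here is hypothetical:

```r
# Hypothetical data frame with two categorical columns
df <- data.frame(colour = c("red", "blue", "red"),
                 size   = c("S", "S", "L"))

# One frequency table per column, returned as a named list
freqs <- lapply(df, table)
freqs$colour
freqs$size
```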

answered Oct 04 '22 06:10

Christian Savemark
