
Frequency counts in R [duplicate]

This may seem like a very basic R question, but I'd appreciate an answer. I have a data frame in the form of:

col1    col2
a   g
a   h
a   g
b   i
b   g
b   h
c   i

I've tried using the table() function, but I only seem to be able to get the counts for one column. I want to transform the data into counts, so that the outcome looks like this:

    a   b   c
g   2   1   0
h   1   1   0
i   0   1   1

How do I do it in R?

aa762 asked Sep 19 '13 12:09



2 Answers

I'm not really sure what you used, but table works fine for me!

Here's a minimal reproducible example:

df <- structure(list(V1 = c("a", "a", "a", "b", "b", "b", "c"), 
                     V2 = c("g", "h", "g", "i", "g", "h", "i")), 
                .Names = c("V1", "V2"), class = "data.frame", 
                row.names = c(NA, -7L))
table(df)
#    V2
# V1  g h i
#   a 2 1 0
#   b 1 1 1
#   c 0 0 1

Notes:

  • Try table(df[c(2, 1)]) (or table(df$V2, df$V1)) to swap the rows and columns.
  • Use as.data.frame.matrix(table(df)) to get a data.frame as your output. (as.data.frame will create a long data.frame, which is not the output format you want.)
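
As a minimal sketch of the two notes above, assuming the same data as in the example (the names counts and wide are illustrative only):

```r
# Same data as in the example above
df <- data.frame(V1 = c("a", "a", "a", "b", "b", "b", "c"),
                 V2 = c("g", "h", "g", "i", "g", "h", "i"))

# Put V2 on the rows and V1 on the columns, as in the question
counts <- table(df$V2, df$V1)

# Convert the contingency table to a wide data.frame
wide <- as.data.frame.matrix(counts)
wide
#   a b c
# g 2 1 0
# h 1 1 0
# i 0 1 1
```

This reproduces the layout asked for in the question, with the values of col2 as row names and the values of col1 as column names.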
A5C1D2H2I1M1N2O1R2T1 answered Sep 23 '22 23:09


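Using the data frame from @Ananda's answer (called f here), you can use dcast from the reshape2 package:

library(reshape2)

f <- df  # the data frame defined in the answer above

> dcast(f, V1~V2)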
Using V2 as value column: use value.var to override.
Aggregation function missing: defaulting to length
  V1  g  h  i
1 a   2  1  0
2 b   1  1  1
3 c   0  0  1

However, I'm writing this only in case you need something more than just table in the future (for this case, table is the simplest correct answer), for example:

set.seed(1)
f$var <- rnorm(7)

> f
  V1 V2        var
1 a   g -0.6264538
2 a   h  0.1836433
3 a   g -0.8356286
4 b   i  1.5952808
5 b   g  0.3295078
6 b   h -0.8204684
7 c   i  0.4874291

> dcast(f, V1~V2, value.var="var", fun.aggregate=sum)
  V1          g          h         i
1 a  -1.4620824  0.1836433 0.0000000
2 b   0.3295078 -0.8204684 1.5952808
3 c   0.0000000  0.0000000 0.4874291
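
For comparison, the same per-cell sums can probably be obtained without reshape2. The sketch below swaps in base R's xtabs, which is not part of the original answer; it rebuilds f so that it runs on its own, and the name sums is illustrative only:

```r
# Rebuild the example data so the snippet runs on its own
f <- data.frame(V1 = c("a", "a", "a", "b", "b", "b", "c"),
                V2 = c("g", "h", "g", "i", "g", "h", "i"))
set.seed(1)
f$var <- rnorm(7)

# xtabs sums the left-hand side (var) within each V1/V2 combination;
# combinations that never occur are filled with 0
sums <- xtabs(var ~ V1 + V2, data = f)

# Optional: convert the result to a wide data.frame
as.data.frame.matrix(sums)
```

Unlike dcast, xtabs only sums, so dcast remains the more flexible choice when a different aggregation function is needed.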
Michele answered Sep 25 '22 23:09