Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Count occurrences of value in a set of variables in R (per row)

Let's say I have a data frame with 10 numeric variables V1-V10 (columns) and multiple rows (cases).

What I would like R to do is: For each case, give me the number of occurrences of a certain value in a set of variables.

For example the number of occurrences of the numeric value 99 in that single row for V2, V3, V6, which obviously has a minimum of 0 (none of the three have the value 99) and a maximum of 3 (all of the three have the value 99).

I am really looking for an equivalent to the SPSS function COUNT: "COUNT creates a numeric variable that, for each case, counts the occurrences of the same value (or list of values) across a list of variables."

I thought about table() and library plyr's count(), but I cannot really figure it out. Vectorized computation preferred. Thanks a lot!

like image 333
nilsole Avatar asked Jun 03 '14 12:06

nilsole


People also ask

How do you count how many times a value is repeated in R?

Use the length() function to count the number of elements returned by the which() function, as which function returns the elements that are repeated more than once. The length() function in R Language is used to get or set the length of a vector (list) or other objects.

How do I count unique values in a row in R?

For example, if we have a data frame called df that contains multiple columns then the number of unique values in each row of df can be found by using the command apply(df,1,function(x) length(unique(x))).

How do I count variables in group by in R?

count() lets you quickly count the unique values of one or more variables: df %>% count(a, b) is roughly equivalent to df %>% group_by(a, b) %>% summarise(n = n()) .


2 Answers

I think that there ought to be a simpler way to do this, but the best way that I can think of to get a table of counts is to loop (implicitly using sapply) over the unique values in the dataframe.

#Some example data
df <- data.frame(a=c(1,1,2,2,3,9),b=c(1,2,3,2,3,1))
df
#  a b
#1 1 1
#2 1 2
#3 2 3
#4 2 2
#5 3 3
#6 9 1

levels=unique(do.call(c,df)) #all unique values in df
out <- sapply(levels,function(x)rowSums(df==x)) #count occurrences of x in each row
colnames(out) <- levels
out
#     1 2 3 9
#[1,] 2 0 0 0
#[2,] 1 1 0 0
#[3,] 0 1 1 0
#[4,] 0 2 0 0
#[5,] 0 0 2 0
#[6,] 1 0 0 1
like image 195
Miff Avatar answered Sep 29 '22 09:09

Miff


If you need to count any particular word/letter in the row.

#Let df be a data frame with four variables (V1-V4)
             df <- data.frame(V1=c(1,1,2,1,L),V2=c(1,L,2,2,L),
             V3=c(1,2,2,1,L), V4=c(L, L, 1,2, L))

For counting number of L in each row just use

#This is how to compute a new variable counting occurences of "L" in V1-V4.      
df$count.L <- apply(df, 1, function(x) length(which(x=="L")))

The result will appear like this

> df
  V1 V2 V3 V4 count.L
1  1  1  1 L       1
2  1  L  2 L       2
3  2  2  2  1      0
4  1  2  1  2      0
like image 35
Adi.sr Avatar answered Sep 29 '22 09:09

Adi.sr