Counting unique items in data frame

Tags:

r

I want a simple count of the number of subjects in each condition of a study. The data look something like this:

subjectid  cond  obser  variable
1234       1     1      12
1234       1     2      14
2143       2     1      19
3456       1     1      12
3456       1     2      14
3456       1     3      13

etc        etc   etc    etc

This is a large dataset and it is not always obvious how many unique subjects contribute to each condition, etc.

I have this in a data.frame.

What I want is something like

cond   ofSs 
1       122 
2        98

Where for each "condition" I get a count of the number of unique Ss contributing data to that condition. Seems like this should be painfully simple.

asked Mar 28 '11 13:03

WGray

1 Answer

Use the ddply function from the plyr package:

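library(plyr)
# Toy data: 7 rows with randomly drawn subject and condition labels
df <- data.frame(subjectid = sample(1:3, 7, TRUE),
                 cond = sample(1:2, 7, TRUE), obser = sample(1:7))

# Count the distinct subjects within each condition
# (the data are random, so the counts vary from run to run)
> ddply(df, .(cond), summarize, NumSubs = length(unique(subjectid)))
  cond NumSubs
1    1       1
2    2       2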

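The ddply function splits the data frame by the cond variable, applies summarize to each piece, and combines the results into one data frame. Here the summary column NumSubs holds the number of distinct subjectid values found in each condition.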
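
If you would rather not load plyr, the same split-and-count idea can be sketched in base R with aggregate(). This is only an illustration, not part of the original answer: the data frame below is typed in from the six sample rows in the question, and the names value (standing in for the "variable" column) and counts are made up for the example.

```r
# Six sample rows copied from the question; "value" stands in for the
# question's "variable" column (a name chosen only for this example).
df <- data.frame(
  subjectid = c(1234, 1234, 2143, 3456, 3456, 3456),
  cond      = c(1, 1, 2, 1, 1, 1),
  obser     = c(1, 2, 1, 1, 2, 3),
  value     = c(12, 14, 19, 12, 14, 13)
)

# For each level of cond, count the distinct subjectid values.
counts <- aggregate(subjectid ~ cond, data = df,
                    FUN = function(x) length(unique(x)))

# Rename the summary column to match the ddply output above.
names(counts)[2] <- "NumSubs"
print(counts)
```

On these six rows, condition 1 gets 2 subjects (1234 and 3456) and condition 2 gets 1 (2143). The anonymous function is the same length(unique(...)) expression used in the ddply call, so the two approaches should agree on any data frame with these columns.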

answered Oct 17 '22 07:10

Prasad Chalasani
