
How to get number of rows for a specific value in a column

Tags:

r

I am trying to get the number of rows for a specific value in a column. I have three columns: Name, Age, and Major. How can I find out, for example, how many BIO majors there are in this list?

I have a data frame DF with the columns (NAME, YEAR, MAJOR, GPA), and I want a function that eliminates any major with fewer than 20 people.

So I want something like this, but in actual R code.

DF <- function(x){
##  Y <- get number of people for each major
##  GPA [DF$Y < 20] <- NA

Any help would be appreciated.

user2446334 asked Jun 18 '13 19:06




1 Answer

I think the two methods offered so far are overly complex. Try either of these, the second of which is obviously the "Right way". :-) (Borrowing @gung's example.)

#  1
> tapply( DF$MAJOR, DF$MAJOR, length)
 BIO ECON HIST  LIT MATH 
 181  155  297  303   64 

#  2
> table(DF$MAJOR)

 BIO ECON HIST  LIT MATH 
 181  155  297  303   64 
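For the narrower question in the title (the count for one specific value), a comparison summed over the column should be enough. The sketch below uses a small made-up data frame in place of the asker's DF; the names and values are only for illustration.

```r
# Toy data standing in for the asker's DF (values are invented)
DF <- data.frame(
  NAME  = paste0("student", 1:6),
  MAJOR = c("BIO", "ECON", "BIO", "HIST", "BIO", "ECON"),
  stringsAsFactors = FALSE
)

# Count the rows where MAJOR equals one specific value;
# add na.rm = TRUE if the column may contain NA
n_bio <- sum(DF$MAJOR == "BIO")

# The same number, pulled out of the full frequency table by name
n_bio_tab <- table(DF$MAJOR)[["BIO"]]
```

Both give the same count; the table() form is handier when you need several majors at once.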

And as far as efficiency?

> system.time( {dt = data.table(DF)
+  foo <- dt[,.N,by=MAJOR] })
   user  system elapsed 
  1.384   0.027   1.417 
> system.time(foo<- table(DF$MAJOR) )
   user  system elapsed 
  0.110   0.025   0.134 
#edit:
> system.time( {dt = as.data.table(DF)
+  foo <- dt[,.N,by=MAJOR] })
   user  system elapsed 
  0.064   0.022   0.086 

To answer the follow-up question in the comments about how to attach the group count to each student record: look at the ave function, then filter with either "["-extraction or subset:

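 # FUN must be passed by name, otherwise ave() treats it as another grouping variable
 DF$group.size <- ave(seq_along(DF$MAJOR), DF$MAJOR, FUN = length)
 # keep only the majors with at least 20 students
 newDF <- DF[ DF$group.size >= 20 , ]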
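One possible way to wrap this up as the function the question asks for is sketched below. The helper name mask_small_majors, its min_n argument, and the toy data are all hypothetical; only ave(), table() and the column names MAJOR and GPA come from the thread.

```r
# Hypothetical helper: set GPA to NA for majors with fewer than `min_n` students
mask_small_majors <- function(df, min_n = 20) {
  # Per-row group size: ave() applies FUN within each MAJOR and
  # returns a vector aligned with the rows of df
  size <- ave(seq_len(nrow(df)), df$MAJOR, FUN = length)
  df$GPA[size < min_n] <- NA
  df
}

# Toy data (invented): 25 BIO students and 3 MATH students
DF <- data.frame(
  MAJOR = rep(c("BIO", "MATH"), times = c(25, 3)),
  GPA   = rep(3.0, 28),
  stringsAsFactors = FALSE
)

# Variant 1: keep every row, but blank the GPA of small majors
masked <- mask_small_majors(DF)

# Variant 2: drop the rows of small majors entirely, using table() directly
big_majors <- names(which(table(DF$MAJOR) >= 20))
kept <- DF[DF$MAJOR %in% big_majors, ]
```

Which variant fits depends on whether "eliminate" means removing the rows or only hiding the GPA, which the question leaves open.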

IRTFM answered Sep 21 '22 07:09
