I am trying to get the number of rows that have a specific value in a column. I have three columns: Name, Age, and Major. How can I find out, for example, how many BIO majors there are in this list?
I have a DF <- (NAME, YEAR, MAJOR, GPA). I want a function so I can eliminate any major with fewer than 20 people.
So I want something like this, but in actual R code.
DF <- function(x){
## Y <- get number of people for each major
## GPA [DF$Y < 20] <- NA
}
Any help would be appreciated
Use the COUNTIF function to count how many times a particular value appears in a range of cells.
I think the two methods offered so far are overly complex. Try either of these, the second of which is obviously the "Right way". :-) (Borrowing @gung's example.)
# 1
> tapply( DF$MAJOR, DF$MAJOR, length)
BIO ECON HIST LIT MATH
181 155 297 303 64
# 2
> table(DF$MAJOR)
BIO ECON HIST LIT MATH
181 155 297 303 64
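For the specific question of how many BIO majors there are, a single entry can be pulled out of the table by name, or the matching rows can be counted directly. This is a minimal sketch on a small made-up data frame; the names and values are invented for illustration and are not the data used above.

```r
# Toy data, invented for illustration only
DF <- data.frame(
  NAME  = c("Ann", "Bob", "Cy", "Dee", "Eve"),
  MAJOR = c("BIO", "ECON", "BIO", "HIST", "BIO"),
  stringsAsFactors = FALSE
)

# Counts for every major at once
counts <- table(DF$MAJOR)

# Look up one major by name in the table ...
n_bio_from_table <- counts[["BIO"]]

# ... or count the matching rows directly
n_bio_direct <- sum(DF$MAJOR == "BIO")
```

Both approaches should agree; the table lookup is handy when counts for several majors are needed, while the direct sum avoids building the table at all.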
And as far as efficiency?
> system.time( {dt = data.table(DF)
+ foo <- dt[,.N,by=MAJOR] })
user system elapsed
1.384 0.027 1.417
> system.time(foo<- table(DF$MAJOR) )
user system elapsed
0.110 0.025 0.134
#edit:
> system.time( {dt = as.data.table(DF)
+ foo <- dt[,.N,by=MAJOR] })
user system elapsed
0.064 0.022 0.086
To answer the appended question in the comments, about how to associate a tabular result with each student record: look at the ave function, and use the first method with either "["-extraction or subset:
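# FUN must be named; an index vector keeps the sizes numeric
DF$group.size <- ave(seq_along(DF$MAJOR), DF$MAJOR, FUN = length)
# keep majors with at least 20 students, as the question asks
newDF <- DF[ DF$group.size >= 20 , ]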
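Putting the pieces together for the original request, here is a sketch of a function that sets GPA to NA for small majors. The function name `drop_small_majors`, the `min_n` argument, and the toy data are assumptions made for illustration. `ave` is given an integer index vector rather than `MAJOR` itself, so the group sizes come back as numbers even when `MAJOR` is a character or factor column.

```r
# Hypothetical helper: set GPA to NA for majors with fewer than min_n students
drop_small_majors <- function(df, min_n = 20) {
  # Group size for every row, aligned with the rows of df
  group_size <- ave(seq_along(df$MAJOR), df$MAJOR, FUN = length)
  df$GPA[group_size < min_n] <- NA
  df
}

# Toy data, invented for illustration: 25 BIO students and 3 MATH students
DF <- data.frame(
  MAJOR = c(rep("BIO", 25), rep("MATH", 3)),
  GPA   = 3.0,
  stringsAsFactors = FALSE
)

result <- drop_small_majors(DF)
```

To drop the rows instead of blanking the GPA, the same `group_size` vector can be used for subsetting, for example `df[group_size >= min_n, ]`.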