I have a data frame like this:
date time userid status
1 02/25/2012 09:22:10 aabc logged_in
2 02/25/2012 09:30:10 aabc logged_out
3 02/25/2012 09:29:20 abbc logged_out
4 02/25/2012 09:27:30 abc logged_in
5 02/25/2012 09:26:29 abc login_failed
6 02/25/2012 09:26:39 abc login_failed
7 02/25/2012 09:26:52 abc login_failed
8 02/25/2012 09:27:09 abc login_failed
9 02/25/2012 09:27:20 abc login_failed
10 02/25/2012 09:24:10 abdc logged_in
11 02/25/2012 09:24:12 abdc logged_out
12 02/25/2012 09:22:10 abhc logged_in
13 02/25/2012 09:30:10 abuc logged_in
14 02/25/2012 09:30:14 abuc logged_out
15 02/25/2012 09:29:40 baa logged_in
I want to the userids, status and "count" of the login_failures for each userid. I did this:
ddply(mytbl, c('userid', 'status'), function(x) c(count=nrow(x)))
, but this gives the count for all userids. I want to restrict my output to only those userids whose status is 'login _failed'. Any ideas? I have seen questions on grouping by numeric fields, but, none on strings.
I am not very familiar with all the plyr features. It will great to see how this can be done using summarize, aggregate, sqldf, data.table etc. Slowly understanding each of them.
Thanks Sri
require(data.table)
DT = as.data.table(mytbl)
DT[status=="login_failed", .N, by=userid]
To name the column :
DT[status=="login_failed", list(failed_logins=.N), by=userid]
Slightly different approach than @Maiasaura. I filter to just the failed logins and then summarize. The difference would be whether those userid
's with logins, but no failed logins, appear in the final result with 0's or not.
ddply(mytbl[mytbl$status=="login_failed",], .(userid),
summarise, failed_logins=length(status))
This gives
> ddply(mytbl[mytbl$status=="login_failed",], .(userid),
+ summarise, failed_logins=length(status))
userid failed_logins
1 abc 5
To complete the approaches, if you want all the userid
's:
ddply(mytbl, .(userid),
summarise, failed_logins = sum(status=="login_failed"))
which gives
> ddply(mytbl, .(userid),
+ summarise, failed_logins = sum(status=="login_failed"))
userid failed_logins
1 aabc 0
2 abbc 0
3 abc 5
4 abdc 0
5 abhc 0
6 abuc 0
7 baa 0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With