Aggregate (count) rows that match a condition, group by unique values

It seems like such a simple problem, yet I've been pulling my hair out trying to get this to work:

Given this data frame identifying the interactions id had with contact, who is grouped by contactGrp,

head(data)
   id               sesTs  contact    contactGrp   relpos   maxpos
1 6849 2012-06-25 15:58:34   peter        west    0.000000      3
2 6849 2012-06-25 18:24:49   sarah        south   0.500000      3
3 6849 2012-06-27 00:13:30   sarah        south   1.000000      3
4 1235 2012-06-29 17:49:35   peter        west    0.000000      2
5 1235 2012-06-29 23:56:35   peter        west    1.000000      2
6 5893 2012-06-30 22:21:33   carl         east    0.000000      1

How many contacts were there for each value of unique(data$contactGrp) with relpos=1 and maxpos>1?

The expected result would be:

1 west   1
2 south  1
3 east   0

A small subset of the lines I have tried:

  • aggregate(data, by=list('contactGrp'), FUN=count) yields an error and does no filtering
  • using data.table seems to require a key, which is not unique in this data…
  • ddply(data,"contactGrp",summarise,count=???): not sure which function to use to fill the count column
  • ddply(subset(data,maxpos>1 & relpos==0), c('contactGrp'), function(df)count(df$relpos)) works, but it gives me an extra column x and it feels like I've overcomplicated it…

SQL would be easy: SELECT contactGrp, COUNT(*) AS cnt FROM data WHERE … GROUP BY contactGrp, but I'm trying to learn R.

Lukas Grebe asked Jul 20 '12 13:07


1 Answer

And here is the data.table solution:

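> library(data.table)
> # sessions is the data frame shown in the question (called data there)
> dt <- data.table(sessions)
> # rows per contactGrp with relpos == 0 and maxpos > 1
> dt[, length(contact[relpos == 0 & maxpos > 1]), by = contactGrp]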
     contactGrp V1
[1,]       west  2
[2,]      south  0
[3,]       east  0

> # rows per contactGrp with relpos == 1 and maxpos > 1 (the condition asked for)
> dt[, length(contact[relpos == 1 & maxpos > 1]), by = contactGrp]
     contactGrp V1
[1,]       west  1
[2,]      south  1
[3,]       east  0
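
For comparison, the same count can be sketched in base R without any extra packages. This is only an illustration under a few assumptions: the data frame is rebuilt by hand from the six rows printed in the question (sesTs is left out because it is not needed), and the helper column hit and the result name cnt are made up for this example. The idea is that summing a logical vector counts its TRUE values, so groups with no matching rows still show up with 0.

```r
# Rebuild the six rows shown in the question (sesTs omitted; not needed here).
data <- data.frame(
  id         = c(6849, 6849, 6849, 1235, 1235, 5893),
  contact    = c("peter", "sarah", "sarah", "peter", "peter", "carl"),
  contactGrp = c("west", "south", "south", "west", "west", "east"),
  relpos     = c(0, 0.5, 1, 0, 1, 0),
  maxpos     = c(3, 3, 3, 2, 2, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper column: TRUE where the row satisfies the condition.
data$hit <- data$relpos == 1 & data$maxpos > 1

# Sum the logical column within each group; sum() of a logical counts the TRUEs,
# so a group with no matching rows is kept with a count of 0.
cnt <- aggregate(hit ~ contactGrp, data = data, FUN = sum)
names(cnt)[2] <- "cnt"  # rename the count column (name chosen for this example)

print(cnt)
```

In data.table the same idea would be dt[, list(cnt = sum(relpos == 1 & maxpos > 1)), by = contactGrp], which also gives the count column a name instead of V1.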
Ryogi answered Sep 18 '22 15:09
