
dplyr calculate a new column by applying summarise function on another dataframe

Tags: r, dplyr

I want to create a new column (CNT) in a data frame called df. The value should be calculated with the summarise() function from the dplyr package. It should be a number, since I need to count rows in another data frame (cars); however, the filter conditions are determined by the values in two columns of df.

dataframe:

library(dplyr)
df <- data.frame("my_speed" = 11:20, "my_dist" = c(17,20,15,17,21,23,28,36,50,80))

As an example, this is the calculation for the first row of df.

x=df[1,1]
y=df[1,2]

cars %>%
  group_by(speed) %>%
  filter(speed == x & dist == y) %>%
  summarise(count = n()) %>%
  select(count)

I am trying to figure out how to use summarise() or another method to do this easily for every row of df. Note that if summarise() returns no records, the result should be zero. This is my attempt so far, which does not work:

df %>%
  rowwise() %>%
  filter(speed == my_speed & dist == my_dist) %>%
  summarise(count = n()) %>%
  select(count) %>%
  mutate(CNT = count)

Ibo asked Oct 19 '25 07:10


1 Answer

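With rowwise(), each row of df is processed on its own, so we can compare its my_speed and my_dist against the columns of cars and take the sum of the resulting logical vector directly. The sum counts the matching rows and is 0 when nothing matches, so no additional filter() or summarise() step is needed.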

df %>% 
   rowwise %>% 
   mutate(CNT = sum((cars$speed == my_speed) & (cars$dist == my_dist)))
# A tibble: 10 x 3
#   my_speed my_dist   CNT
#      <int>   <dbl> <int>
# 1       11      17     1
# 2       12      20     1
# 3       13      15     0
# 4       14      17     0
# 5       15      21     0
# 6       16      23     0
# 7       17      28     0
# 8       18      36     0
# 9       19      50     0
#10       20      80     0
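
As an alternative that avoids rowwise(), the same counts can probably be obtained with a join. The following is only a sketch, not part of the approach above: it assumes the built-in cars data set and the df from the question, and the name pair_counts is made up for illustration. It counts each (speed, dist) pair in cars once, joins those counts onto df, and turns the NA for unmatched pairs into 0.

```r
library(dplyr)

# Same df as in the question
df <- data.frame(my_speed = 11:20,
                 my_dist  = c(17, 20, 15, 17, 21, 23, 28, 36, 50, 80))

# Count how often each (speed, dist) pair occurs in cars
# (pair_counts is a hypothetical helper name)
pair_counts <- cars %>%
  count(speed, dist, name = "CNT")

# Attach the counts to df; pairs that never occur in cars come back as NA,
# so replace those with 0
result <- df %>%
  left_join(pair_counts, by = c("my_speed" = "speed", "my_dist" = "dist")) %>%
  mutate(CNT = coalesce(CNT, 0L))

result
```

This version scans cars once instead of once per row of df, which may matter for larger inputs; for a ten-row df the rowwise() form is just as reasonable.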

akrun answered Oct 20 '25 21:10