
R group by | count distinct values grouping by another column

Tags:

r

How can I count the number of distinct visit_id values per post_pagename?

visit_id  post_pagename
1       A
1       B
1       C
1       D 
2       A
2       A
3       A
3       B

Result should be:

post_pagename distinct_visit_ids
A     3
B     2
C     1
D     1

I tried it with:

library(dplyr)
# Name the columns directly; cbind() would coerce visit_id to character
test_df <- data.frame(visit_id = c(1,1,1,1,2,2,3,3),
                      post_pagename = c("A","B","C","D","A","A","A","B"))
test_df

test_df %>%
  group_by(post_pagename) %>%
  summarize(vis_count = n_distinct(visit_id))

But this gives me only the total number of distinct visit_id values in my whole data set, not one count per page.

asked Jan 02 '23 05:01 by flobrr


1 Answer

One way is to drop duplicate rows first, then count the remaining rows per page:

test_df |>
  distinct() |>
  count(post_pagename)

#   post_pagename     n
#   <fct>         <int>
# 1 A                 3
# 2 B                 2
# 3 C                 1
# 4 D                 1
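The same deduplicate-then-count idea can be sketched in base R, in case dplyr is not available. This is only an illustration: the names dedup and counts are made up here, and test_df is rebuilt so the snippet runs on its own.

```r
# Rebuild the example data from the question
test_df <- data.frame(
  visit_id = c(1, 1, 1, 1, 2, 2, 3, 3),
  post_pagename = c("A", "B", "C", "D", "A", "A", "A", "B")
)

# Drop repeated (visit_id, post_pagename) pairs, like distinct()
dedup <- unique(test_df)

# Tabulate the remaining rows per page, like count(post_pagename)
counts <- as.data.frame(
  table(post_pagename = dedup$post_pagename),
  responseName = "n"
)

counts
```

Because each (visit_id, post_pagename) pair survives only once after unique(), the number of rows left per page equals the number of distinct visits to that page.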

Or another, counting the distinct visit_id values within each group:

test_df |>
  group_by(post_pagename) |>
  summarise(distinct_visit_ids = n_distinct(visit_id))

# A tibble: 4 x 2
#  post_pagename distinct_visit_ids
#  <fct>                      <int>
#1 A                              3
#2 B                              2
#3 C                              1
#4 D                              1

*D has one visit, so it must be counted*
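The grouped pipeline is essentially what the question already tried, so the single number reported there probably has another cause. One common cause, which is only an assumption here since the thread does not confirm it, is that plyr was attached after dplyr, so summarize() resolved to plyr::summarize(), which ignores the grouping. A minimal sketch that sidesteps this by qualifying every call with its namespace:

```r
# Rebuild the example data from the question
test_df <- data.frame(
  visit_id = c(1, 1, 1, 1, 2, 2, 3, 3),
  post_pagename = c("A", "B", "C", "D", "A", "A", "A", "B")
)

# dplyr:: prefixes make each call resolve to dplyr,
# regardless of which other packages are attached
res <- test_df |>
  dplyr::group_by(post_pagename) |>
  dplyr::summarise(distinct_visit_ids = dplyr::n_distinct(visit_id))

res
```

If this returns one row per page while the unqualified version does not, running conflicts() or search() in the session should show which package is masking the function.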
answered Feb 19 '23 08:02 by utubun