Get count of group-level observations with multiple individual observations from dataframe in R

Question

How do I get a dataframe like this:

soccer_player country position
"sam"         USA     left defender
"jon"         USA     right defender
"sam"         USA     left midfielder
"jon"         USA     offender
"bob"         England goalie
"julie"       England central midfielder
"jane"        England goalie

To look like this (country with the counts of unique players per country):

country player_count
USA     2
England 3

The obvious complication is that there are multiple observations per player, so I cannot simply do table(df$country) to get the number of observations per country.

I have been playing with the table() and merge() functions but have not had any luck.

Matthew Plourde · Accepted Answer

Here's one way:

as.data.frame(table(unique(d[-3])$country))
#      Var1 Freq
# 1 England    3
# 2     USA    2

Here d is the data frame from the question. Drop the third column (position), remove any duplicate player-country pairs, then count the occurrences of each country.
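The three steps can also be written out one at a time, which may be easier to follow. This is only a sketch in base R: the names pairs and counts are made up for illustration, and the data frame is rebuilt by hand from the question so the snippet runs on its own.

```r
# Rebuild the question's data (the name d follows the answer above).
d <- data.frame(
  soccer_player = c("sam", "jon", "sam", "jon", "bob", "julie", "jane"),
  country = c("USA", "USA", "USA", "USA", "England", "England", "England"),
  position = c("left defender", "right defender", "left midfielder",
               "offender", "goalie", "central midfielder", "goalie"),
  stringsAsFactors = FALSE
)

# Step 1: keep only the player and country columns.
pairs <- d[c("soccer_player", "country")]

# Step 2: drop repeated player-country pairs (sam and jon each appear twice).
pairs <- unique(pairs)

# Step 3: count rows per country, naming the columns as in the question.
counts <- as.data.frame(table(country = pairs$country),
                        responseName = "player_count")
counts
```

Selecting the columns by name instead of with d[-3] keeps the code working if the column order changes, and responseName replaces the default Freq column name with player_count.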

Ben Bolker · Answer

The new features of dplyr v 0.3 (distinct() and count()) provide a compact solution:

Data:

dd <- read.csv(text='
soccer_player,country,position
"sam",USA,left defender
"jon",USA,right defender
"sam",USA,left midfielder
"jon",USA,offender
"bob",England,goalie
"julie",England,central midfielder
"jane",England,goalie')

Code:

library(dplyr)

dd %>% distinct(soccer_player,country) %>% 
       count(country)
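A possible variant, assuming the same dd as above, skips the explicit distinct() step and counts unique players inside each country group with n_distinct(). The column name player_count is chosen here to match the question; it is not part of the original answer.

```r
library(dplyr)

# Same data as in the answer above, repeated so the snippet runs on its own.
dd <- read.csv(text='
soccer_player,country,position
"sam",USA,left defender
"jon",USA,right defender
"sam",USA,left midfielder
"jon",USA,offender
"bob",England,goalie
"julie",England,central midfielder
"jane",England,goalie')

# Group rows by country, then count the distinct player names in each group.
player_counts <- dd %>%
  group_by(country) %>%
  summarise(player_count = n_distinct(soccer_player))

player_counts  # England has 3 unique players, USA has 2
```

Unlike count(), which names its result column n, this form lets you name the count column directly.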

Tags:

dataframe

r

goldisfine

2 Answers

Matthew Plourde

Ben Bolker
