How do I get a dataframe like this:
soccer_player country position
"sam" USA left defender
"jon" USA right defender
"sam" USA left midfielder
"jon" USA offender
"bob" England goalie
"julie" England central midfielder
"jane" England goalie
To look like this (country with the counts of unique players per country):
country player_count
USA 2
England 3
The obvious complication is that there are multiple observations per player, so I cannot simply use table(df$country): that would count observations per country, not unique players.
I have been playing with the table() and merge() functions but have not had any luck.
Here's one way:
as.data.frame(table(unique(d[-3])$country))
# Var1 Freq
# 1 England 3
# 2 USA 2
Here d is the data frame from the question. Drop the third column (position), remove any duplicate player-country pairs, then count the occurrences of each country.
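The three steps described above can be spelled out one at a time. This is only a sketch: the data frame d is rebuilt here from the question's example, and the intermediate names players and counts are made up for illustration.

d <- data.frame(
  soccer_player = c("sam", "jon", "sam", "jon", "bob", "julie", "jane"),
  country = c("USA", "USA", "USA", "USA", "England", "England", "England"),
  position = c("left defender", "right defender", "left midfielder",
               "offender", "goalie", "central midfielder", "goalie"),
  stringsAsFactors = FALSE
)
# Step 1 and 2: drop the position column, keep one row per player-country pair
players <- unique(d[-3])
# Step 3: count how many rows (unique players) each country has
counts <- as.data.frame(table(players$country), stringsAsFactors = FALSE)
# Rename the default Var1/Freq columns to match the desired output
names(counts) <- c("country", "player_count")
counts

Renaming the columns at the end is optional; it just makes the result match the country / player_count layout asked for in the question.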
The new features of dplyr 0.3 (distinct() and count()) provide a compact solution:
Data:
dd <- read.csv(text='
soccer_player,country,position
"sam",USA,left defender
"jon",USA,right defender
"sam",USA,left midfielder
"jon",USA,offender
"bob",England,goalie
"julie",England,central midfielder
"jane",England,goalie')
Code:
library(dplyr)
dd %>%
  distinct(soccer_player, country) %>%
  count(country)
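If you want the count column to be called player_count as in the question, a possible variant is to group by country and count distinct players with n_distinct(). This is a sketch rather than part of the original answer; the name result is made up for illustration, and the data is repeated so the snippet runs on its own.

library(dplyr)
dd <- read.csv(text = 'soccer_player,country,position
"sam",USA,left defender
"jon",USA,right defender
"sam",USA,left midfielder
"jon",USA,offender
"bob",England,goalie
"julie",England,central midfielder
"jane",England,goalie')
# One row per country, counting each player name only once within it
result <- dd %>%
  group_by(country) %>%
  summarise(player_count = n_distinct(soccer_player))
result

This gives the same counts as distinct() followed by count(), but lets you name the output column directly.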