This is the code I have set up so far:
library(dslabs)
library(dplyr)
library(lubridate)
data("reported_heights")
dat <- mutate(reported_heights, date_time = ymd_hms(time_stamp)) %>%
  filter(date_time >= make_date(2016, 01, 25) & date_time < make_date(2016, 02, 1)) %>%
  mutate(type = ifelse(day(date_time) == 25 & hour(date_time) == 8 &
                         between(minute(date_time), 15, 30),
                       "inclass", "online")) %>%
  select(sex, type, time_stamp)
y <- factor(dat$sex, c("Female", "Male"))
x <- dat$type
counter <- count(dat, sex,type)
It gives me a tbl_df that looks like this:
  sex    | type    | n
1 Female | inclass | 26
2 Male   | inclass | 13
3 Female | online  | 42
4 Male   | online  | 69
Can you help me with code that calculates the proportion of each sex in each type of class?
I have been trying to create a new table with the x values "inclass" and "online" as columns and the y factor levels "Female" and "Male" as rows, holding the proportions. I have been trying to do this with pull() and prop.table(), but I am a total newbie and it would mean the world to me if you could help. I have been going through answers for hours now, and maybe the answer is already out there, so please excuse me if I just can't find it. Thank you so much.
What is the proportion of each sex (male and female) in each type of class (inclass and online)?
It can be calculated by dividing the count for each sex by the total number of students in that type of class.
For example: 42 females are studying online out of a total of 42 + 69 = 111 online students, so about 38% of the online class is female.
How can we do this in R?
To create a frequency table in R, we can use the table() function, but it returns an object of class table that prints in a wide (horizontal) layout. To work with the result as a data frame, convert it with as.data.frame().
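As a rough sketch of that conversion, applied to counts that have already been tallied (as in the question's table): the object names counts, tab, props and props_df are made up for illustration, and xtabs() is used here as one way to rebuild the contingency table from a count column.

```r
# Counts copied from the question's table (illustrative object name)
counts <- data.frame(
  sex  = c("Female", "Male", "Female", "Male"),
  type = c("inclass", "inclass", "online", "online"),
  n    = c(26L, 13L, 42L, 69L)
)

# Rebuild the two-way contingency table (sex by type) from the counts
tab <- xtabs(n ~ sex + type, data = counts)

# Column-wise proportions (margin = 2): each class type sums to 1
props <- prop.table(tab, margin = 2)

# Long data-frame form: one row per sex/type pair,
# with the proportion stored in the column named "Freq"
props_df <- as.data.frame(props)
props_df
```

The long form is convenient if the proportions are going to be filtered, joined or plotted afterwards; the wide table is easier to read at a glance.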
Using prop.table():
prop.table(table(y, x), 2)
#         x
# y          inclass    online
#   Female 0.6666667 0.3783784
#   Male   0.3333333 0.6216216
You may use table(),
my.table <- with(dat, table(sex, type))
my.table
#         type
# sex      inclass online
#   Female      26     42
#   Male        13     69
and apply() a function to the result.
res <- apply(my.table, 2, function(x) x/sum(x)*100)
res
#         type
# sex       inclass   online
#   Female 66.66667 37.83784
#   Male   33.33333 62.16216
To get nicer output, you could round() the values and then append a % sign.
res2 <- as.data.frame(unclass(round(res, 1)))
res2[] <- lapply(res2, paste0, "%")
res2
#        inclass online
# Female   66.7%  37.8%
# Male     33.3%  62.2%
To get the proportion (here as a percentage) within each type of class, we can use ave() in base R:
df$prop <- with(df, n/ave(n, type, FUN = sum)) * 100
df
#      sex    type  n     prop
# 1 Female inclass 26 66.66667
# 2   Male inclass 13 33.33333
# 3 Female  online 42 37.83784
# 4   Male  online 69 62.16216
The same can be achieved with dplyr:
library(dplyr)
df %>% group_by(type) %>% mutate(prop = n/sum(n) * 100)
and with data.table:
library(data.table)
setDT(df)[, prop := n/sum(n) * 100, by = type]
Data:
df <- structure(list(sex = structure(c(1L, 2L, 1L, 2L), .Label = c("Female",
"Male"), class = "factor"), type = structure(c(1L, 1L, 2L, 2L
), .Label = c("inclass", "online"), class = "factor"), n = c(26L,
13L, 42L, 69L)), class = "data.frame", row.names = c(NA, -4L))