Aggregate or summarize to get ratios

Question

The following is a toy problem that demonstrates my question.

I have a data frame that contains a bunch of employees; for each employee, it has a name, salary, gender and state.

aggregate(salary ~ state, data, FUN = mean)  # Returns the average salary per state
aggregate(salary ~ state + gender, data, FUN = mean)  # Avg salary per state/gender

What I actually need is a summary of the fraction of the total salary earned by women in each state.

aggregate(salary ~ state + gender, data, FUN = sum)

returns the total salary earned by women (and men) in each state, but what I really need is salary_w / salary_total at the per-state level. I can write a for-loop, etc., but I am wondering if there is some way to use aggregate to do that.

Chase · Accepted Answer

Another option would be using plyr. ddply() expects a data.frame as input and returns a data.frame as output. The second argument specifies how to split the data frame. The third argument is the function to apply to each chunk; here we use summarise to build a new data.frame from the existing one.

library(plyr)

#Using the sample data from kohske's answer above

> ddply(d, .(state), summarise, ratio = sum(salary[gender == "Woman"]) / sum(salary))
  state     ratio
1     1 0.5789860
2     2 0.4530224
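
For anyone who wants to stay in base R, here is a rough sketch of one way to get the same kind of ratio with aggregate() itself. The data frame d below is made-up toy data (not the sample data referenced above), and the column names salary_w and salary_total are just illustrative choices. The idea is to aggregate twice, once over women only and once over everyone, then merge the two results by state.

```r
# Hypothetical toy data, invented for illustration only
d <- data.frame(
  state  = c("A", "A", "A", "B", "B", "B"),
  gender = c("Woman", "Man", "Woman", "Man", "Woman", "Man"),
  salary = c(100, 200, 300, 150, 50, 300)
)

# Total salary earned by women in each state
women <- aggregate(salary ~ state, data = d[d$gender == "Woman", ], FUN = sum)

# Total salary earned by everyone in each state
total <- aggregate(salary ~ state, data = d, FUN = sum)

# Join the two summaries on state; the suffixes rename the clashing
# "salary" columns to salary_w and salary_total
res <- merge(women, total, by = "state", suffixes = c("_w", "_total"))
res$ratio <- res$salary_w / res$salary_total
res
```

Note that merge() does an inner join by default, so a state with no women at all would drop out of the result; passing all.y = TRUE would keep it, with NA in place of the women's total.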

Tags:

r

aggregate

bsdfish

1 Answers

Chase
