Add a column of derived data to a data.frame

Tags: dataframe, r

Suppose I have a simple table of sales data:

> df<-data.frame(country=c("A", "A", "B", "B"), outlet=c(1,2,1,2), sales=c(300, 900,10,40))
> df
  country outlet sales
1       A      1   300
2       A      2   900
3       B      1    10
4       B      2    40

and would like to add a column showing the fraction of all sales in that country contributed by each outlet. I can do this by splitting on country, iterating over the outlets, and recombining with rbind, but this looks quite ugly to me:

> do.call("rbind",lapply(split(df, df$country), function(x) { x$frac <- NA; tot<-sum(x$sales); for (o in x$outlet) {s<-x[x$outlet== o,]$sales; x[x$outlet == o,]$frac <- s/tot}; return(x)}))
    country outlet sales frac
A.1       A      1   300 0.25
A.2       A      2   900 0.75
B.3       B      1    10 0.20
B.4       B      2    40 0.80

Is there a cleaner way of doing this simple task (other than writing a function for it, which merely sweeps the ugliness into a script)?

(and for bonus points, is there a way of preventing rbind from adding row names like A.1 to the resulting data.frame?)

ScarletPumpernickel asked Dec 20 '25 19:12



2 Answers

Another alternative, using ave() to get each row's country total:

df$frac <- df$sales / ave(df$sales, df$country, FUN = sum)
df
#  country outlet sales frac
#1       A      1   300 0.25
#2       A      2   900 0.75
#3       B      1    10 0.20
#4       B      2    40 0.80
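
To make the one-liner above easier to follow, here is a minimal sketch that splits it into two steps. The intermediate name country_total is only illustrative and is not part of the original answer. ave() returns a vector as long as its input, holding the group-wise result repeated for every member of the group, so the division lines up row by row.

```r
# Rebuild the example data from the question.
df <- data.frame(country = c("A", "A", "B", "B"),
                 outlet  = c(1, 2, 1, 2),
                 sales   = c(300, 900, 10, 40))

# ave() gives one value per row: the country's total sales,
# repeated for each outlet in that country.
country_total <- ave(df$sales, df$country, FUN = sum)
country_total
# [1] 1200 1200   50   50

# Elementwise division yields each outlet's share of its country's sales.
df$frac <- df$sales / country_total
```

Because nothing is split or rbind-ed here, the original row order and row names are left untouched.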
alexis_laz answered Dec 22 '25 11:12



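Here is an easier way, using tapply() for the per-country totals and match() to look them up for each row:

x <- tapply(df$sales, df$country, sum)  # total sales by country, named by country
df$frac <- df$sales / x[match(df$country, names(x))]  # pick each row's country total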
df
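
On the bonus question about row names: neither approach above goes through rbind, so the A.1-style names never appear. If the split/rbind route is kept anyway, one option is to reset the row names afterwards. The sketch below assumes the question's data; parts and res are illustrative names, and the per-group function is a vectorised stand-in for the loop in the question.

```r
# Rebuild the example data from the question.
df <- data.frame(country = c("A", "A", "B", "B"),
                 outlet  = c(1, 2, 1, 2),
                 sales   = c(300, 900, 10, 40))

# Split by country and compute the share within each piece (no inner loop needed).
parts <- lapply(split(df, df$country), function(x) {
  x$frac <- x$sales / sum(x$sales)
  x
})

# Recombining a named list prefixes the row names with the list names.
res <- do.call(rbind, parts)

# Setting the row names to NULL restores the default 1..n numbering.
rownames(res) <- NULL
res
```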
ndr answered Dec 22 '25 11:12



