
How can I normalize data frame values by the group sum (get percentages)?

I have the following data frame:

> str(df)
 'data.frame':  52 obs. of  3 variables:
  $ n    : int  10 20 64 108 128 144 256 320 404 512 ...
  $ step : Factor w/ 4 levels "Step1","Step2",..: 1 1 1 1 1 1 1 1 1 1 ...
  $ value: num  0.00178 0.000956 0.001613 0.001998 0.002975 ...

Now I would like to divide df$value by the sum of the values that share the same n, so that I get percentages. The code below doesn't work, but it shows what I am trying to achieve: I precompute the per-n sums into dfa, then try to divide the original df$value by the aggregated total dfa$value with the matching n:

dfa <- aggregate(x=df$value, by=list(df$n), FUN=sum)
names(dfa)[names(dfa)=="Group.1"] <- "n"           
names(dfa)[names(dfa)=="x"] <- "value"
df$value <- df$value / dfa[dfa$n==df$n,][[1]]
SkyWalker asked Aug 27 '12 16:08



1 Answer

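I think the following works, using the data.table package. Here := adds a new column value2 by reference, and by=n makes sum(value) the total within each n, so value2 is each row's share of its group's total.

library(data.table)
df <- data.table(df)
# value2 = value divided by the sum of value within the same n
df[,value2 := value/sum(value),by=n]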
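If adding a package is not an option, the same per-group division can probably be done in base R. The sketch below is not part of the original answer; it uses a small made-up data frame with the question's column names, and the pct and pct2 columns are hypothetical names chosen for illustration. It also suggests why the attempt in the question fails: dfa$n==df$n compares two vectors of different lengths element by element instead of looking up each row's group, and [[1]] selects the n column rather than the summed values.

```r
# Hypothetical toy data mirroring the question's columns (n, step, value).
df <- data.frame(
  n     = c(10, 10, 20, 20),
  step  = factor(c("Step1", "Step2", "Step1", "Step2")),
  value = c(1, 3, 2, 2)
)

# ave() applies FUN within each group of n and returns a vector
# aligned with the original rows, so the division is row by row.
df$pct <- df$value / ave(df$value, df$n, FUN = sum)

# Alternative that keeps the question's aggregate() step:
# match() finds, for every row of df, the row of dfa with the same n.
dfa <- aggregate(value ~ n, data = df, FUN = sum)
df$pct2 <- df$value / dfa$value[match(df$n, dfa$n)]
```

Both columns should hold the same fractions; multiply by 100 if percentages rather than proportions are wanted.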
Blue Magister answered Nov 15 '22 07:11
