Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

dplyr: divide all values in group by group's first value

Tags:

r

dplyr

My df looks something like this:

ID    Obs    Value
1     1      26
1     2      13
1     3      52
2     1      1,5
2     2      30

Using dplyr, I to add the additional column Col, which is the result of a division of all values in the column value by the group's first value in that column.

ID    Obs    Value    Col
1     1      26       1
1     2      13       0,5
1     3      52       2
2     1      1,5      1
2     2      30       20

How do I do that?

like image 635
TIm Haus Avatar asked Jan 01 '23 09:01

TIm Haus


2 Answers

After grouping by 'ID', use mutate to create a new column by dividing the 'Value' by the first of 'Value'

library(dplyr)
df1 %>%
   group_by(ID) %>%
   mutate(Col = Value/first(Value))

If the first 'Value' is 0 and we don't want to use it, then subset the 'Value' with a logical expression and then take the first of that

df1 %>%
   group_by(ID) %>%
    mutate(Col = Value/first(Value[Value != 0]))

Or in base R

df1$Col <- with(df1, Value/ave(Value, ID, FUN = head, 1))

NOTE: The comma in 'Value' suggests it is a character column. In that case, it should be first changed to decimal (.) if that is the case, convert to nunmeric and then do the division. It can be done while reading the data

like image 113
akrun Avatar answered Jan 05 '23 04:01

akrun


Or, without creating an additional column:

library(tidyverse)

df = data.frame(ID=c(1,1,1,2,2), Obs=c(1,2,3,1,2), Value=c(26, 13, 52, 1.5, 30))

df %>% 
  group_by(ID) %>% 
  mutate_at('Value', ~./first(.))
#> # A tibble: 5 x 3
#> # Groups:   ID [2]
#>      ID   Obs Value
#>   <dbl> <dbl> <dbl>
#> 1     1     1   1  
#> 2     1     2   0.5
#> 3     1     3   2  
#> 4     2     1   1  
#> 5     2     2  20

### OR ###
df %>% 
  group_by(ID) %>% 
  mutate_at('Value', function(x) x/first(x))
#> # A tibble: 5 x 3
#> # Groups:   ID [2]
#>      ID   Obs Value
#>   <dbl> <dbl> <dbl>
#> 1     1     1   1  
#> 2     1     2   0.5
#> 3     1     3   2  
#> 4     2     1   1  
#> 5     2     2  20

Created on 2020-01-04 by the reprex package (v0.3.0)

like image 24
abalter Avatar answered Jan 05 '23 04:01

abalter