 

R: calculating fraction of values in column, grouped by value in another column

Tags: r

I have spent hours trying to find a solution to this. I have searched SO; if I have overlooked an existing answer, please close this as a duplicate.

I have a matrix, sorted by transcript_id, then cond:

transcript_id    cond    expr
A1               B1      40
A1               B2      30
A1               B3      20
A2               B2      35
A2               B3      45
A3               B1      23
A4               B1      64
A4               B3      43

I would like to add a new column giving each row's expr as a fraction of the total expr within its transcript_id:

transcript_id    cond    expr   frac
A1               B1      40     0.4444
A1               B2      30     0.3333
A1               B3      20     0.2222
A2               B2      35     0.4375
A2               B3      45     0.5625
A3               B1      23     1
A4               B1      64     0.5981
A4               B3      43     0.4019

Is there a smart way to achieve this?

My naive approach would be to write a function that loops over every unique value of transcript_id, but I am stuck. Note that not every transcript_id has rows for all three cond values.

Mads Obi asked Feb 08 '23 16:02



1 Answer

One way with data.table:

library(data.table)
# setDT() converts df to a data.table by reference; := then adds the column frac,
# computed as each expr divided by the sum of expr within its transcript_id group
setDT(df)[, frac := expr / sum(expr), by=transcript_id]

Output:

> df
   transcript_id cond expr      frac
1:            A1   B1   40 0.4444444
2:            A1   B2   30 0.3333333
3:            A1   B3   20 0.2222222
4:            A2   B2   35 0.4375000
5:            A2   B3   45 0.5625000
6:            A3   B1   23 1.0000000
7:            A4   B1   64 0.5981308
8:            A4   B3   43 0.4018692
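
If you would rather not depend on data.table, the same grouped fraction can be sketched in base R. This is only a minimal sketch: it assumes the data is a data.frame named df (rebuilt here from the table in the question) and uses ave() to get the per-group sum aligned with each row.

```r
# Rebuild the example data from the question (assumed to be a data.frame named df)
df <- data.frame(
  transcript_id = c("A1", "A1", "A1", "A2", "A2", "A3", "A4", "A4"),
  cond          = c("B1", "B2", "B3", "B2", "B3", "B1", "B1", "B3"),
  expr          = c(40, 30, 20, 35, 45, 23, 64, 43)
)

# ave() applies FUN within each transcript_id group and returns a vector
# the same length as expr, so every row gets its own group's total
group_total <- ave(df$expr, df$transcript_id, FUN = sum)

# Fraction of each expr within its transcript_id
df$frac <- df$expr / group_total

# Sanity check: the fractions within each transcript_id should sum to 1
frac_sums <- tapply(df$frac, df$transcript_id, sum)
```

Because ave() works per group, it does not matter that some transcript_id values are missing one or more cond levels. Unlike the data.table version, this adds the column by ordinary assignment rather than by reference.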
LyzandeR answered Feb 11 '23 07:02
