I have spent hours trying to find a solution to this, including searching SO. If I have overlooked an existing answer, please close this as a duplicate.
I have a matrix, sorted by transcript_id, then cond:
transcript_id cond expr
A1 B1 40
A1 B2 30
A1 B3 20
A2 B2 35
A2 B3 45
A3 B1 23
A4 B1 64
A4 B3 43
I would like a new column, where the fraction of expr within each transcript_id is listed:
transcript_id cond expr frac
A1 B1 40 0.4444
A1 B2 30 0.3333
A1 B3 20 0.2222
A2 B2 35 0.4375
A2 B3 45 0.5625
A3 B1 23 1
A4 B1 64 0.5981
A4 B3 43 0.4019
Is there a smart way to achieve this?
My naive approach would be to write a function that loops over every unique element in transcript_id, but I am stuck.
Note that not every transcript_id has all three cond values.
One way with data.table:
library(data.table)
# setDT() converts df to a data.table by reference; := then adds frac,
# i.e. each expr divided by the sum of expr within its transcript_id group
setDT(df)[, frac := expr / sum(expr), by=transcript_id]
Output:
> df
transcript_id cond expr frac
1: A1 B1 40 0.4444444
2: A1 B2 30 0.3333333
3: A1 B3 20 0.2222222
4: A2 B2 35 0.4375000
5: A2 B3 45 0.5625000
6: A3 B1 23 1.0000000
7: A4 B1 64 0.5981308
8: A4 B3 43 0.4018692
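If you would rather not load a package, the same per-group division can probably be done in base R with ave(). This is only a sketch: it assumes the data is a plain data.frame called df with the column names from the question, and it rebuilds the example data so it runs on its own.

```r
# Rebuild the example data from the question (column names as in the question).
df <- data.frame(
  transcript_id = c("A1", "A1", "A1", "A2", "A2", "A3", "A4", "A4"),
  cond          = c("B1", "B2", "B3", "B2", "B3", "B1", "B1", "B3"),
  expr          = c(40, 30, 20, 35, 45, 23, 64, 43)
)

# ave() returns, for every row, the sum of expr within that row's transcript_id,
# so the division gives each row's share of its group total.
df$frac <- df$expr / ave(df$expr, df$transcript_id, FUN = sum)
```

Because ave() returns a vector aligned with the original rows, this does not depend on the data being sorted, and a transcript_id with a single row simply gets a fraction of 1.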