How to calculate difference from initial value for each group in R?

Tags:

r

I have data arranged like this in R:

indv    time    val
A          6    5
A         10    10
A         12    7
B          8    4
B         10    3
B         15    9

For each individual (indv) at each time, I want to calculate the change in value (val) from the initial time. So I would end up with something like this:

indv time   val val_1   val_change
A       6     5    5       0
A      10    10    5       5
A      12     7    5       2
B       8     4    4       0
B      10     3    4      -1
B      15     9    4       5

Can anyone tell me how I might do this? I can use

ddply(df, .(indv), function(x)x[which.min(x$time), ])

to get a table like

indv    time    val
A          6    5   
B          8    4

However, I cannot figure out how to make a column val_1 where the value at the earliest time is matched up for each individual. If I can do that, I should be able to add column val_change using something like:

df['val_change'] = df['val'] - df['val_1']

EDIT: Two excellent methods were posted below; however, both rely on my time column being sorted so that small time values come before large ones. I'm not sure this will always be the case with my data. (I know I can sort first in Excel, but I'm trying to avoid that.) How could I deal with a case where the table looks like this:

indv    time    value
A          10   10
A           6   5
A          12   7
B           8   4
B          10   3
B          15   9

asked Nov 14 '12 21:11

Thomas

2 Answers

Here is a data.table solution. It is memory efficient because := adds the new columns by reference within the data.table. Setting the key also sorts the table by the key variables, so val[1] is the value at the earliest time for each indv:

library(data.table)
DT <- data.table(df)
# set key to sort by indv then time
setkey(DT, indv, time)
DT[, c('val1', 'change') := list(val[1], val - val[1]), by = indv]
# And to show it works....
DT
##    indv time val val1 change
## 1:    A    6   5    5      0
## 2:    A   10  10    5      5
## 3:    A   12   7    5      2
## 4:    B    8   4    4      0
## 5:    B   10   3    4     -1
## 6:    B   15   9    4      5
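If you would rather not rely on the rows being sorted, one possible variant is to pick the baseline with which.min(time) inside each group. This is only a sketch: the data frame below is retyped from the question's edit (with the column named val), and the column names val1 and change follow the answer above.

```r
library(data.table)

# Data from the question's edit, deliberately unsorted within indv "A".
df <- data.frame(
  indv = c("A", "A", "A", "B", "B", "B"),
  time = c(10, 6, 12, 8, 10, 15),
  val  = c(10, 5, 7, 4, 3, 9)
)

DT <- as.data.table(df)

# which.min(time) locates the earliest time in each group,
# so the baseline does not depend on row order.
DT[, val1 := val[which.min(time)], by = indv]
DT[, change := val - val1]
DT
```

The rows stay in their original order here; call setkey(DT, indv, time) afterwards if you also want them sorted.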

answered Sep 22 '22 01:09

mnel

Here's a plyr solution using ddply:

library(plyr)
ddply(df, .(indv), transform, 
      val_1 = val[1],
      change = (val - val[1]))

  indv time val val_1 change
1    A    6   5     5      0
2    A   10  10     5      5
3    A   12   7     5      2
4    B    8   4     4      0
5    B   10   3     4     -1
6    B   15   9     4      5

To get your second table try this:

ddply(df, .(indv), function(x) x[which.min(x$time), ])
  indv time val
1    A    6   5
2    B    8   4
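The same which.min idea can probably be used directly inside transform, so that val_1 is taken from the row with the smallest time rather than from the first row. A minimal sketch, assuming the unsorted data from the question's edit with the column named val:

```r
library(plyr)

# Data from the question's edit, unsorted within indv "A".
df <- data.frame(
  indv = c("A", "A", "A", "B", "B", "B"),
  time = c(10, 6, 12, 8, 10, 15),
  val  = c(10, 5, 7, 4, 3, 9)
)

# Within each indv, take val at the earliest time as the baseline.
# The baseline expression is repeated because transform() cannot refer
# to a column created in the same call.
res <- ddply(df, .(indv), transform,
             val_1  = val[which.min(time)],
             change = val - val[which.min(time)])
res
```

This keeps the original row order within each individual; only the baseline lookup changes.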

Edit 1

To deal with unsorted data, like the table you posted in your edit, try the following:

unsort <- read.table(text="indv    time    value
A          10   10
A           6   5
A          12   7
B           8   4
B          10   3
B          15   9", header = TRUE)


do.call(rbind, lapply(split(unsort, unsort$indv), 
                  function(x) x[order(x$time), ]))
    indv time value
A.2    A    6     5
A.1    A   10    10
A.3    A   12     7
B.4    B    8     4
B.5    B   10     3
B.6    B   15     9

Now you can apply the procedure described above to this sorted data frame.
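Put together, the two steps might look like the sketch below. It sorts with base R's order() instead of the split/lapply construction above, and it uses value rather than val because that is the column name in unsort; the name sorted is just an illustrative choice.

```r
library(plyr)

unsort <- read.table(text = "indv time value
A 10 10
A 6 5
A 12 7
B 8 4
B 10 3
B 15 9", header = TRUE)

# Sort by individual, then by time, so the first row of each group
# is the initial measurement.
sorted <- unsort[order(unsort$indv, unsort$time), ]

# Apply the ddply/transform step to the sorted data.
res <- ddply(sorted, .(indv), transform,
             val_1  = value[1],
             change = value - value[1])
res
```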

Edit 2

A shorter way to sort your data frame is to use the orderBy function from the doBy package:

library(doBy)
orderBy(~ indv + time, unsort)
  indv time value
2    A    6     5
1    A   10    10
3    A   12     7
4    B    8     4
5    B   10     3
6    B   15     9

Edit 3

You can even sort your df using ddply. Note that the columns come back in a different order:

ddply(unsort, .(indv, time), sort)
  value time indv
1     5    6    A
2    10   10    A
3     7   12    A
4     4    8    B
5     3   10    B
6     9   15    B

answered Sep 21 '22 01:09

Jilber Urbina
