
Take difference between first and last observations in a row, where each row is different

Tags: r, dplyr, tidyverse

I have data that looks like the following:

  Region X2012 X2013 X2014 X2015 X2016 X2017
1      1    10    11    12    13    14    15
2      2    NA    17    14    NA    23    NA
3      3    12    18    18    NA    23    NA
4      4    NA    NA    15    28    NA    38
5      5    14  18.5    16    27    25    39
6      6    15    NA    17  27.5    NA    39

The numbers themselves are irrelevant; what I am trying to do is take the difference between the earliest and latest observed values in each row and store it in a new column, like this:

Region              Diff
     1     (15 - 10) = 5
     2     (23 - 17) = 6

and so on, showing only the final result rather than the subtraction. Ideally I would just subtract the 2012 column from the 2017 column, but since a row's first observation could fall in any column and its last observation could fall in any later column, I am unsure how to take the difference.

A dplyr solution would be ideal but any solution at all is appreciated.

quantumofnolace asked Jan 26 '23 23:01



1 Answer

Define a function that drops the NAs from its vector argument and returns the last remaining element minus the first, then apply it to each row of the year columns (DF[-1] excludes the Region column).

lastMinusFirst <- function(x, y = na.omit(x)) tail(y, 1) - y[1]
transform(DF, diff = apply(DF[-1], 1, lastMinusFirst))

giving:

  Region X2012 X2013 X2014 X2015 X2016 X2017 diff
1      1    10  11.0    12  13.0    14    15    5
2      2    NA  17.0    14    NA    23    NA    6
3      3    12  18.0    18    NA    23    NA   11
4      4    NA    NA    15  28.0    NA    38   23
5      5    14  18.5    16  27.0    25    39   25
6      6    15    NA    17  27.5    NA    39   24
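
To see how the function arrives at these values, it may help to step through a single row by hand. The sketch below uses the year values of Region 2 from the table above; the variable names x and y are just for illustration.

```r
# Same function as in the answer above
lastMinusFirst <- function(x, y = na.omit(x)) tail(y, 1) - y[1]

# Year values for Region 2, including the missing entries
x <- c(NA, 17, 14, NA, 23, NA)

y <- na.omit(x)    # drops the NAs, leaving 17 14 23
tail(y, 1)         # latest observed value: 23
y[1]               # earliest observed value: 17

lastMinusFirst(x)  # 23 - 17 = 6
```

Because the NAs are removed before indexing, it does not matter in which columns the first and last observations fall.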

Note

The input in reproducible form:

Lines <- "Region X2012 X2013 X2014 X2015 X2016 X2017
1      1    10    11    12    13    14    15
2      2    NA    17    14    NA    23    NA
3      3    12    18    18    NA    23    NA
4      4    NA    NA    15    28    NA    38
5      5    14  18.5    16    27    25    39
6      6    15    NA    17  27.5    NA    39"
DF <- read.table(text = Lines)
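
One edge case worth keeping in mind: if a row has no observed values at all, na.omit() leaves an empty vector and lastMinusFirst returns a zero-length result instead of NA, so apply() can no longer build a plain numeric column. A minimal sketch of a guarded variant follows; the name lastMinusFirstSafe and the small three-row data frame are hypothetical, made up only for this illustration.

```r
# Hypothetical variant: returns NA when a row has no non-missing values
lastMinusFirstSafe <- function(x) {
  y <- x[!is.na(x)]                       # keep only observed values
  if (length(y) == 0) return(NA_real_)    # nothing observed in this row
  y[length(y)] - y[1]                     # last observed minus first observed
}

# Small made-up data frame; Region 3 has no observations
DF <- data.frame(
  Region = 1:3,
  X2012 = c(10, NA, NA),
  X2013 = c(11, 17, NA),
  X2014 = c(15, 23, NA)
)

DF$diff <- apply(DF[-1], 1, lastMinusFirstSafe)
DF
```

Here diff comes out as 5 and 6 for the first two regions and NA for the third.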

Update

Fixed.
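
Since the question mentions that a dplyr solution would be ideal, here is one possible tidyverse sketch of the same idea. It is an alternative to, not a restatement of, the base R approach above, and it assumes the dplyr and tidyr packages are installed (tidyr 1.0 or later for pivot_longer). The data frame holds the first three regions from the question; the names diffs and result are illustrative.

```r
library(dplyr)
library(tidyr)

# First three regions from the question's data
DF <- data.frame(
  Region = 1:3,
  X2012 = c(10, NA, 12),
  X2013 = c(11, 17, 18),
  X2014 = c(12, 14, 18),
  X2015 = c(13, NA, NA),
  X2016 = c(14, 23, 23),
  X2017 = c(15, NA, NA)
)

diffs <- DF %>%
  # one row per Region/year pair, kept in column (year) order
  pivot_longer(-Region, names_to = "year", values_to = "value") %>%
  # discard the missing observations
  filter(!is.na(value)) %>%
  group_by(Region) %>%
  # latest observed value minus earliest observed value
  summarise(diff = last(value) - first(value))

# Attach the differences to the original wide data
result <- left_join(DF, diffs, by = "Region")
result
```

A region with no observations at all drops out of diffs and therefore gets NA after the left join.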

G. Grothendieck answered May 05 '23 18:05
