
How do you sum consecutive values in a column and create a new column of those summed values?

Tags: r

In a data frame, I am trying to calculate the total thickness at certain intervals within a stratigraphic column and to add these totals to the data frame as new columns. I am new to R, and I am struggling with how to add up consecutive values in a column.

Any help or insight would be appreciated. Even suggested readings or help pages.

I am trying to calculate the top and base of specific beds in a stratigraphic column, that is, the total thickness at specific intervals in the column. I have the thickness of each bed, and I want to calculate where its top and base lie relative to the very bottom of the entire column. I have been struggling to come up with a solution; I feel like I simply do not know enough functions and commands to write the code I need. I also think I may need to create a function to do this.

This is the data I am starting with: the lithology (rock type) and the total thickness of each bed. The last row is the absolute base of the stratigraphic column, which therefore has no thickness.

Lithology  Thickness
     sand          4
      mud          1
     sand          5
      mud          3
      mud          5
     sand          2
   bottom          0

What I hope to do is create two new columns in which I calculate the height/top of each rock type and the base of each rock type to end up with a data frame like the one below.

I want to add/sum the thicknesses to calculate the top and base of each lithology, in reference to the bottom.

So, to calculate the top of the middle sand, I want to sum the thicknesses of all the lithologies below it, including the middle sand itself. To calculate its base, I want to sum the thicknesses of all the lithologies below it, excluding the middle sand. I want to do this for every lithology.

Lithology  Thickness Top Base
     sand          4  20   16
      mud          1  16   15
     sand          5  15   10
      mud          3  10    7
      mud          5   7    2
     sand          2   2    0
   bottom          0   0    0

Any help is greatly appreciated, thank you for your time!

Jay May asked Mar 04 '23 05:03


1 Answer

In these alternatives we use the input shown reproducibly in the Note at the end.

1) within. The base is the total thickness minus the cumulative thickness down to and including the current bed, which we can calculate using cumsum. The top is the base plus the current thickness. No packages are used.

within(DF, {
  Base <- sum(Thickness) - cumsum(Thickness)
  Top <- Base + Thickness
})

giving:

  Lithology Thickness Top Base
1      sand         4  20   16
2       mud         1  16   15
3      sand         5  15   10
4       mud         3  10    7
5       mud         5   7    2
6      sand         2   2    0
7    bottom         0   0    0
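To see why this works, it can help to look at the intermediate vectors. The following is only an illustrative sketch: it rebuilds the thicknesses inline, and the names `thickness`, `total` and `above` are chosen here for illustration and do not appear in the answer's code.

```r
# Bed thicknesses listed from the top of the column to the bottom,
# as in the question (the last entry is the zero-thickness "bottom" row)
thickness <- c(4, 1, 5, 3, 5, 2, 0)

total <- sum(thickness)     # 20: height of the whole column
above <- cumsum(thickness)  # 4 5 10 13 18 20 20: running total from the top,
                            # including the current bed

base <- total - above       # 16 15 10 7 2 0 0: what is left below each bed
top  <- base + thickness    # 20 16 15 10 7 2 0: base plus the bed itself
```

Each element of `base` is the part of the column that lies below that bed, so adding the bed's own thickness gives its top.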

2) transform. Top is the total thickness minus the cumulative thickness, plus the current thickness (so the current bed is not subtracted). Base is the same expression without the last term. No packages are used.

transform(DF, 
  Top = sum(Thickness) - cumsum(Thickness) + Thickness,
  Base = sum(Thickness) - cumsum(Thickness))

2a) Within a single transform call, one new column cannot refer to another new column, so to reuse Top when computing Base we can nest two transform calls:

transform(
  transform(DF, Top = sum(Thickness) - cumsum(Thickness) + Thickness),
  Base = Top - Thickness)

2b) Alternatively, compute Base first and then pass it to transform:

Base <- with(DF, sum(Thickness) - cumsum(Thickness))
transform(DF, Top = Base + Thickness, Base = Base)
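Another way to read the same calculation, offered here as a sketch rather than as one of the numbered alternatives, is that Top is simply a cumulative sum taken from the bottom row upward. Reversing the vector, applying cumsum, and reversing back gives that directly. The data frame is rebuilt inline so the snippet runs on its own; `result` is a name chosen for illustration.

```r
# Same data as in the Note, rebuilt inline so this runs on its own
DF <- data.frame(
  Lithology = c("sand", "mud", "sand", "mud", "mud", "sand", "bottom"),
  Thickness = c(4, 1, 5, 3, 5, 2, 0)
)

# Accumulate from the bottom row upward, then flip back to the original order
Top <- rev(cumsum(rev(DF$Thickness)))

# The base of each bed is its top minus its own thickness
Base <- Top - DF$Thickness

result <- cbind(DF, Top = Top, Base = Base)
result
```

This should give the same Top and Base columns as the sum-minus-cumsum form; which one reads more naturally is a matter of taste.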

3) dplyr. With dplyr's mutate, each new column can use the columns already defined to its left in the same call, so one can write:

library(dplyr)

DF %>%
  mutate(Top = sum(Thickness) - cumsum(Thickness) + Thickness,
         Base = Top - Thickness)

4) gsubfn. Using transform2 in the gsubfn package, each computed column can depend on any of the others; it automatically determines the dependencies and carries out the calculations in the correct order.

library(gsubfn)

transform2(DF, 
  Top = Base + Thickness,
  Base = sum(Thickness) - cumsum(Thickness))
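Since the question mentions possibly needing a function, any of the base R alternatives above can be wrapped in one. The sketch below is a hypothetical helper: the name `add_top_base` and its `thickness` argument are assumptions made for illustration, not part of any package. It assumes rows are ordered from the top of the column to the bottom, as in the question.

```r
# Hypothetical helper (not from any package): adds Top and Base columns,
# measured upward from the bottom of the column.
# Assumes rows run from the top of the column down to the bottom.
add_top_base <- function(df, thickness = "Thickness") {
  th <- df[[thickness]]
  # Top: total thickness minus everything above the current bed
  df$Top <- sum(th) - cumsum(th) + th
  # Base: top minus the bed's own thickness
  df$Base <- df$Top - th
  df
}

# Same data as in the Note, rebuilt inline so this runs on its own
DF <- data.frame(
  Lithology = c("sand", "mud", "sand", "mud", "mud", "sand", "bottom"),
  Thickness = c(4, 1, 5, 3, 5, 2, 0)
)

out <- add_top_base(DF)
out
```

Passing the column name as a string keeps the helper usable on data frames whose thickness column is named differently.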

Note

Lines <- "Lithology  Thickness
     sand          4
      mud          1
     sand          5
      mud          3
      mud          5
     sand          2
   bottom          0"
DF <- read.table(text = Lines,  header = TRUE, as.is = TRUE)
G. Grothendieck answered Mar 05 '23 19:03