Create a new column based on a column that does not yet exist

Tags:

r

data.table

I have the below data set:

library(data.table)

DT <- fread("   df1 df2
  1   8
  2   9
  3  10
  4  11
  5  12")

I want to create a new column df3 whose first value is 100 and whose subsequent values are lag(df3, 1) * (1 + df2), i.e. the previous value of df3 multiplied by (1 + df2). The final output should be:

  df1 df2     df3
1   1   8     100
2   2   9    1000
3   3  10   11000
4   4  11  132000
5   5  12 1716000

I have tried running DT[,df3 := lag(df3, 1) * (1 + df2)], but because df3 does not yet exist, I get an error.

736

asked Jun 05 '18 12:06

Maylo

1 Answer

I'm leaving my previous answer below as it had some success, but I had overlooked that cumprod would be much faster:

DT$df3 <-  100 * cumprod(c(0,DT$df2[-1])+1)        # base R
DT[, df3:= 100 * cumprod(c(0,df2[-1])+1)]          # data.table
DT %>% mutate(df3 = 100 * cumprod(c(0,df2[-1])+1)) # tidyverse (only dplyr here)

We compute the cumulative product of df2 + 1, with the first element replaced by 1 (so the first value of df2 is ignored), and multiply the result by 100.
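
To make the steps visible, here is a minimal base R sketch that splits the one-liner into its parts, using the df2 values from the question. The intermediate name `factors` is only illustrative and does not appear in the original code.

```r
# Values of df2 from the question
df2 <- c(8, 9, 10, 11, 12)

# Drop the first value, put 0 in its place, then add 1:
# the first factor becomes 1, the others become df2 + 1
factors <- c(0, df2[-1]) + 1
factors
# 1 10 11 12 13

# Running product of the factors, scaled by the starting value 100
df3 <- 100 * cumprod(factors)
df3
# 100 1000 11000 132000 1716000
```

Because the first factor is 1, the first element of the result is exactly the starting value 100.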

Previous answer with Reduce:

This is a good job for Reduce. The function we apply is simple multiplication, and we make sure to:

add 1 to df2 and ignore the first value.
accumulate the results (accumulate = TRUE)

code:

DT$df3 <- Reduce(`*`,DT$df2[-1]+1,init = 100,accumulate = TRUE)
DT
#    df1 df2     df3
# 1:   1   8     100
# 2:   2   9    1000
# 3:   3  10   11000
# 4:   4  11  132000
# 5:   5  12 1716000
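
For readers unfamiliar with Reduce, the sketch below writes the same recurrence as an explicit loop and compares it with the Reduce call. It assumes the df2 values from the question; the names `via_loop` and `via_reduce` are illustrative only.

```r
# Values of df2 from the question
df2 <- c(8, 9, 10, 11, 12)

# Explicit recurrence: start at 100, then multiply the previous
# result by (1 + df2) for every row after the first
via_loop <- numeric(length(df2))
via_loop[1] <- 100
for (i in seq_along(df2)[-1]) {
  via_loop[i] <- via_loop[i - 1] * (1 + df2[i])
}

# Same computation with Reduce: init is the first value, and
# accumulate = TRUE keeps every intermediate product
via_reduce <- Reduce(`*`, df2[-1] + 1, init = 100, accumulate = TRUE)

all.equal(via_loop, via_reduce)
# TRUE
```

The loop is the most literal translation of lag(df3, 1) * (1 + df2); Reduce expresses the same thing without needing df3 to exist beforehand.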

This works with base R. For more idiomatic data.table syntax, one can follow @jogo's advice and write:

DT[, df3:=Reduce('*', df2[-1]+1, init = 100,accumulate = TRUE)]

And for completeness this would be the tidyverse way:

library(tidyverse)
DT %>% mutate(df3 = accumulate(df2[-1]+1,`*`,.init = 100))

115

answered Sep 22 '22 16:09

Moody_Mudskipper
