ifelse() function - refer to the following day

I have a dataframe with 2 columns: the date and the return.

library(tibble)

df <- tibble(
  date = lubridate::today() + 0:9,
  return = c(1, 2.5, 2, 3, 5, 6.5, 1, 9, 3, 2)
)

Now I want to add a third column using an ifelse condition. If the return on day t is higher than 3.5, then the value for the subsequent day t+1 should be NA; otherwise each day keeps its own return.

Here is my desired output:

   date       return return_subsequent_day
   <date>      <dbl>                 <dbl>
 1 2019-03-14    1                     1
 2 2019-03-15    2.5                   2.5
 3 2019-03-16    2                     2
 4 2019-03-17    3                     3
 5 2019-03-18    5                     5
 6 2019-03-19    6.5                  NA
 7 2019-03-20    1                    NA
 8 2019-03-21    9                     9
 9 2019-03-22    3                    NA
10 2019-03-23    2                     2

Can someone explain how I can formulate this condition?

TobKel asked Mar 14 '19 14:03


3 Answers

Use lag and mutate from dplyr. lag gives the return of the previous row, which we compare with 3.5: if it is greater, we take NA; otherwise we take the return of the current row. Setting default = 0 fills in the missing previous value for the first row, so the first return is kept.

library(dplyr)

df <- df %>%
  mutate(return_subsequent_day = ifelse(lag(return, default = 0) > 3.5, NA, return))

output:

# A tibble: 10 x 3
   date       return return_subsequent_day
   <date>      <dbl>                 <dbl>
 1 2019-03-14    1                     1  
 2 2019-03-15    2.5                   2.5
 3 2019-03-16    2                     2  
 4 2019-03-17    3                     3  
 5 2019-03-18    5                     5  
 6 2019-03-19    6.5                  NA  
 7 2019-03-20    1                    NA  
 8 2019-03-21    9                     9  
 9 2019-03-22    3                    NA  
10 2019-03-23    2                     2  
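
To see why the lag-based comparison works, it can help to look at the shifted vector on its own. The sketch below uses a small made-up vector `x` (not the data from the question) and assumes dplyr is installed; it is only an illustration of the idea.

```r
library(dplyr)

# Hypothetical toy vector, used only to illustrate lag()
x <- c(1, 5, 2, 6.5)

# lag() shifts the values one position down; the first slot is filled with `default`
prev <- lag(x, default = 0)
prev
# [1] 0 1 5 2

# A value is replaced by NA when the value before it is above 3.5
result <- ifelse(prev > 3.5, NA, x)
result
# [1] 1.0 5.0  NA 6.5
```

Only the third element becomes NA, because it is the only one whose predecessor (5) exceeds 3.5. The last value, 6.5, is kept since the rule looks at the previous element, not the current one.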

brettljausn answered Nov 04 '22 08:11


A base R approach is to copy 'return' into a new column 'return_subsequent_day', then use a numeric index ('i1') to set the rows that follow a return above 3.5 to NA.

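# positions of the rows that follow a return above 3.5
i1 <- which(df$return > 3.5) + 1
# drop an index past the last row (when the last return is above 3.5)
i1 <- i1[i1 <= nrow(df)]
df$return_subsequent_day <- df$return
df$return_subsequent_day[i1] <- NA
df$return_subsequent_day
#[1] 1.0 2.5 2.0 3.0 5.0  NA  NA 9.0  NA 2.0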

akrun answered Nov 04 '22 09:11


Simple solution using ifelse

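# lag() leaves the first element NA, so the comparison there is NA as well
df$return_sub_day <- ifelse(dplyr::lag(df$return) > 3.5, NA, df$return)
# restore the first row, which has no previous day
df$return_sub_day[1] <- df$return[1]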
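
If dplyr is not available, the same idea can be sketched in base R by building the shifted vector by hand. The helper name `na_after_high` and its `threshold` argument are hypothetical, introduced here for illustration only.

```r
# Hypothetical helper: return x with every element replaced by NA
# when the element before it is strictly greater than `threshold`.
na_after_high <- function(x, threshold = 3.5) {
  # Previous-day values; the first day has no predecessor, so use NA
  prev <- c(NA, head(x, -1))
  # An NA predecessor (first element, or a missing return) never triggers the rule
  blank <- !is.na(prev) & prev > threshold
  x[blank] <- NA
  x
}

returns <- c(1, 2.5, 2, 3, 5, 6.5, 1, 9, 3, 2)  # values from the question
out <- na_after_high(returns)
out
# [1] 1.0 2.5 2.0 3.0 5.0  NA  NA 9.0  NA 2.0
```

Because the first element's predecessor is NA and is explicitly excluded, no separate fix-up of the first row is needed with this variant.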

Prince answered Nov 04 '22 08:11