
How to use if else in the mutate function in R

Tags:

r

I have a data.frame DT_new with 4 columns:

  1. Graduated (Date format)
  2. Work (Date format)
  3. Married (Date format)
  4. Jumlah (Double format)

Sample:

 Graduated         Work      Married   Jumlah
2015-05-01   2015-05-02   2015-05-03       20
        NA   2015-05-02   2015-05-03       20
        NA           NA   2015-05-03       20
        NA   2015-05-02           NA       20  
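
For anyone who wants to reproduce this, the sample above could be built roughly as follows. This is only a sketch: it assumes the three date columns are of class Date and Jumlah is a plain double, as described in the column list.

```r
# Rebuild the four sample rows; NA entries become missing dates.
DT_new <- data.frame(
  Graduated = as.Date(c("2015-05-01", NA, NA, NA)),
  Work      = as.Date(c("2015-05-02", "2015-05-02", NA, "2015-05-02")),
  Married   = as.Date(c("2015-05-03", "2015-05-03", "2015-05-03", NA)),
  Jumlah    = c(20, 20, 20, 20)
)

# Inspect the column classes
str(DT_new)
```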

I need to aggregate Jumlah by a single date taken from Graduated, Work or Married:

  • when Graduated is not NA, use the date from Graduated
  • when Graduated is NA, fall back to the value from Work or Married

The output format I want is:

     Dates   Total 
2015-05-01      10
2015-05-02      40
2015-05-03      30

I have tried aggregating with group_by in R, but I can only group by one column (Graduated), like this:

library(dplyr)

DT_Totals <- DT_new %>%
  group_by(Graduated) %>%
  summarise(Total = sum(Jumlah)) %>%
  arrange(Graduated)

How can I handle my problem?

ihsansat asked Jul 29 '15 04:07


2 Answers

You first need to create a new column and then group by it.

I have a function that returns, position by position, the first non-NA element across several vectors. It is defined as:

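# Element-wise: for each position, return the first non-NA value
# among the vectors passed in (all of the same length).
first_not_na <- function(...) {
    Reduce(function(x, y) {
        # fill the positions still missing in x with the values from y
        x[is.na(x)] <- y[is.na(x)]
        x
    }, list(...))
}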
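
To see what the helper does before using it in a pipeline, here is a small sketch on plain vectors. The vector names graduated, work and married are made up for illustration and mirror the sample columns; the helper is copied here only so the snippet runs on its own.

```r
# Same helper as above, repeated so this snippet is self-contained.
first_not_na <- function(...) {
  Reduce(function(x, y) {
    x[is.na(x)] <- y[is.na(x)]
    x
  }, list(...))
}

# Hypothetical vectors matching the sample rows
graduated <- as.Date(c("2015-05-01", NA, NA, NA))
work      <- as.Date(c("2015-05-02", "2015-05-02", NA, "2015-05-02"))
married   <- as.Date(c("2015-05-03", "2015-05-03", "2015-05-03", NA))

res <- first_not_na(graduated, work, married)
format(res)
# "2015-05-01" "2015-05-02" "2015-05-03" "2015-05-02"
```

Earlier arguments take priority, so the order in which the vectors are passed decides which date wins. The Date class is kept because the missing positions are filled by subassignment.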

You can use it as follows:

DT_new %>%
    group_by(Date = first_not_na(Graduated, Work, Married)) %>%
    summarise(Total = sum(Jumlah)) %>%
    arrange(Date)

or split it into two steps:

DT_new %>%
    mutate(Date = first_not_na(Graduated, Work, Married)) %>%
    group_by(Date) %>%
    summarise(Total = sum(Jumlah)) %>%
    arrange(Date)
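
Putting the pieces together, a self-contained sketch of the whole approach might look like this. It assumes the four sample rows from the question typed in as Date columns; the second pipeline uses dplyr's coalesce(), which ships with current dplyr releases and should give the same result as the hand-written helper.

```r
library(dplyr)

# Helper from this answer: first non-NA value per position
first_not_na <- function(...) {
  Reduce(function(x, y) {
    x[is.na(x)] <- y[is.na(x)]
    x
  }, list(...))
}

# Sample rows from the question (assumed to be Date columns)
DT_new <- data.frame(
  Graduated = as.Date(c("2015-05-01", NA, NA, NA)),
  Work      = as.Date(c("2015-05-02", "2015-05-02", NA, "2015-05-02")),
  Married   = as.Date(c("2015-05-03", "2015-05-03", "2015-05-03", NA)),
  Jumlah    = c(20, 20, 20, 20)
)

# Pick one date per row, then sum Jumlah per date
result <- DT_new %>%
  mutate(Date = first_not_na(Graduated, Work, Married)) %>%
  group_by(Date) %>%
  summarise(Total = sum(Jumlah)) %>%
  arrange(Date)

# Same idea with dplyr's built-in coalesce()
result_coalesce <- DT_new %>%
  mutate(Date = coalesce(Graduated, Work, Married)) %>%
  group_by(Date) %>%
  summarise(Total = sum(Jumlah)) %>%
  arrange(Date)

print(result)
```

On these four rows as typed, the totals come out as 20 for 2015-05-01, 40 for 2015-05-02 and 20 for 2015-05-03.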
Marek answered Oct 05 '22 16:10


Just create a new date column using nested ifelse() calls:

DT_new %>% 
  mutate(Date1 = ifelse(!is.na(Graduated), Graduated, ifelse(!is.na(Work), Work, Married))) %>% 
  group_by(Date1) %>%
  summarise(Total = sum(Jumlah)) %>%
  arrange(Date1)

Update

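If the columns are of class Date, ifelse() drops the class and returns the underlying day counts as plain numbers, so convert the result back with as.Date():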

DT_new %>% 
  mutate(Date1 = ifelse(!is.na(Graduated), Graduated, ifelse(!is.na(Work), Work, Married))) %>% 
  mutate(Date1 = as.Date(Date1, origin = "1970-01-01")) %>% 
  group_by(Date1) %>%
  summarise(Total = sum(Jumlah)) %>%
  arrange(Date1)
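
The reason for the extra as.Date() step can be checked in base R with a small sketch. The vectors d and w are hypothetical stand-ins for two of the date columns.

```r
# Hypothetical stand-ins for Graduated and Work
d <- as.Date(c("2015-05-01", NA))
w <- as.Date(c("2015-05-02", "2015-05-02"))

# ifelse() builds its result from the logical test vector,
# so the Date class of d and w is lost
picked <- ifelse(!is.na(d), d, w)
class(picked)
# "numeric"

# Convert the day counts (days since 1970-01-01) back to dates
restored <- as.Date(picked, origin = "1970-01-01")
format(restored)
# "2015-05-01" "2015-05-02"
```

If dplyr is already loaded, its if_else() is a possible alternative that keeps the Date class, which would make the conversion step unnecessary.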
bergant answered Oct 05 '22 17:10