I received flattened data, and some values went missing during the flattening. I need to move the hours values up into the rows where hours is NA, matching on id, type and Date, so that the rows with NA in dollar can be removed.
id<-c(1,2,1,1,1,2,1)
dollar<-as.numeric(c(100,200,300,500, NA, NA,NA))
hours<-as.numeric(c(NA,NA, NA, NA, 5,10,12))
type<-c("Engineer", "Engineer","Operating","Part", "Engineer","Engineer","Operating" )
Date<-c("2020-01-02","2020-01-03","2020-01-02","2020-01-04", "2020-01-02","2020-01-03","2020-01-02")
  id dollar hours      type       Date
1  1    100  <NA>  Engineer 2020-01-02
2  2    200  <NA>  Engineer 2020-01-03
3  1    300  <NA> Operating 2020-01-02
4  1    500  <NA>      Part 2020-01-04
5  1   <NA>     5  Engineer 2020-01-02
6  2   <NA>    10  Engineer 2020-01-03
7  1   <NA>    12 Operating 2020-01-02
I would like to reshape my data as shown below.
  id dollar hours      type       Date
1  1    100     5  Engineer 2020-01-02
2  2    200    10  Engineer 2020-01-03
3  1    300    12 Operating 2020-01-02
4  1    500  <NA>      Part 2020-01-04
The rows are not just grouped by id; they must also match on type and Date. 'id' is categorical, 'type' has 17 categories, and 'Date' spans 3 years.
Please help me with this.
Here is a dplyr option using summarise:
library(dplyr)
df %>%
group_by(id, type, Date) %>%
summarise_at(vars(dollar, hours), ~mean(.x, na.rm = TRUE))
## A tibble: 4 x 5
## Groups:   id, type [4]
#     id type      Date       dollar hours
#  <dbl> <fct>     <fct>       <dbl> <dbl>
#1     1 Engineer  2020-01-02    100     5
#2     1 Operating 2020-01-02    300    12
#3     1 Part      2020-01-04    500   NaN
#4     2 Engineer  2020-01-03    200    10
Or even more concisely, summarising every non-grouping column:
df %>% group_by(id, type, Date) %>% summarise_all(~mean(.x, na.rm = TRUE))
Here df is built from the vectors in the question:
df <- data.frame(id, dollar, hours, type, Date)
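Note that mean() with na.rm = TRUE returns NaN for a group whose values are all NA, which is why the Part row shows NaN rather than NA in hours. If a plain NA is preferred, one possible variation is to take the first non-missing value per group instead of the mean. The sketch below assumes dplyr 1.0.0 or later (for across() and the .groups argument) and that each id/type/Date group holds at most one non-missing value per column; first_non_na is a hypothetical helper name, not part of dplyr.

```r
library(dplyr)

# Rebuild the example data from the question
df <- data.frame(
  id     = c(1, 2, 1, 1, 1, 2, 1),
  dollar = c(100, 200, 300, 500, NA, NA, NA),
  hours  = c(NA, NA, NA, NA, 5, 10, 12),
  type   = c("Engineer", "Engineer", "Operating", "Part",
             "Engineer", "Engineer", "Operating"),
  Date   = c("2020-01-02", "2020-01-03", "2020-01-02", "2020-01-04",
             "2020-01-02", "2020-01-03", "2020-01-02")
)

# Hypothetical helper: first non-missing value, or NA if the group has none
first_non_na <- function(x) x[!is.na(x)][1]

# Collapse each id/type/Date group to a single row
result <- df %>%
  group_by(id, type, Date) %>%
  summarise(across(c(dollar, hours), first_non_na), .groups = "drop")

print(result)
```

Unlike the mean-based version, this does not average anything, so if a group could contain several different non-missing values for the same column, only the first one would be kept.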