How to remove the temporal component in an aggregation of a tsibble object?

I have been working with the tsibble package and I can't figure out the proper way to remove the time component from an aggregation result. In the following dataset, I want the mean trips by Region and State. Is converting the tsibble to a tibble the proper way (it might be, I am just not sure), or is there some option I am missing to achieve the aggregation?

library(tsibble)
library(dplyr)

tourism %>% group_by(Region, State) %>% summarise(Mean_trips = mean(Trips))

# A tsibble: 6,080 x 4 [1Q]
# Key:       Region, State [76]
# Groups:    Region [76]
   Region   State           Quarter Mean_trips
   <chr>    <chr>             <qtr>      <dbl>
 1 Adelaide South Australia 1998 Q1       165.
 2 Adelaide South Australia 1998 Q2       112.
 3 Adelaide South Australia 1998 Q3       148.

## This is not what I want. This is what I want:

tourism %>% as_tibble %>% group_by(Region, State) %>% summarise(Mean_trips = mean(Trips))

# A tibble: 76 x 3
# Groups:   Region [76]
   Region                     State              Mean_trips
   <chr>                      <chr>                   <dbl>
 1 Adelaide                   South Australia        143.  
 2 Adelaide Hills             South Australia          7.18
User2321 asked Mar 03 '23 14:03


2 Answers

If we use select(-Quarter) on the tourism data, it gives an informative error message:

library(tsibble)
library(dplyr)

tourism %>% select(-Quarter)

Error: Column Quarter (index) can't be removed. Do you need as_tibble() to work with data frame?

Hence, as_tibble() is the correct way to convert to a tibble before aggregating.

tourism %>% 
    as_tibble %>% 
    group_by(Region, State) %>% 
    summarise(Mean_trips = mean(Trips))

#   Region                     State              Mean_trips
#   <chr>                      <chr>                   <dbl>
# 1 Adelaide                   South Australia        143.  
# 2 Adelaide Hills             South Australia          7.18
# 3 Alice Springs              Northern Territory      14.2 
# 4 Australia's Coral Coast    Western Australia       47.4 
#...
Ronak Shah answered Mar 05 '23 15:03


For the sake of completeness, the tsibble reference manual says:

Column-wise verbs, including select(), transmute(), summarise(), mutate() & transmute(), keep the time context hanging around. That is, the index variable cannot be dropped for a tsibble. If any key variable is changed, it will validate whether it’s a tsibble internally. Use as_tibble() to leave off the time context.

So the temporal component cannot be dropped from a tsibble, and as_tibble() is the right choice to convert to a tibble before aggregating.
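
To see the behaviour described in the manual on something smaller than tourism, here is a minimal sketch on a made-up tsibble. The object names (toy, kept, dropped) and the data values are hypothetical and chosen only for illustration; the calls themselves are the same ones used in the question.

```r
library(tsibble)
library(dplyr)

# Hypothetical toy data: two regions observed over the same two quarters
toy <- tsibble(
  Quarter = rep(yearquarter(c("2020 Q1", "2020 Q2")), times = 2),
  Region  = rep(c("A", "B"), each = 2),
  Trips   = c(10, 20, 30, 50),
  key     = Region,
  index   = Quarter
)

# On a tsibble, summarise() keeps the index, so there is still
# one row per Region and Quarter
kept <- toy %>%
  group_by(Region) %>%
  summarise(Mean_trips = mean(Trips))

# Converting to a tibble first leaves off the time context,
# so the mean is taken over all quarters within each Region
dropped <- toy %>%
  as_tibble() %>%
  group_by(Region) %>%
  summarise(Mean_trips = mean(Trips))

print(kept)
print(dropped)
```

The first result still carries the Quarter column, while the second has one row per Region, which is the shape asked for in the question.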

captcoma answered Mar 05 '23 16:03