I am working with some government-sourced data whose format can be replicated with this code:
Names <- c(rep("A",5),rep("B",5), rep("C",5))
Types <- c(rep(c("Income","Tax Paid","Consumption","Stimulus","NonDurable expenses"),3))
year1 <- c(rep(c(1000,100,300,100,200),3))
year2 <- c(rep(c(2000,200,600,200,400),3))
year3 <- c(rep(c(4000,400,800,400,800),3))
df <- data.frame(Names,Types,year1,year2,year3)
What I want is a new data frame that looks like this:
| Name | year | Income | Tax | Consumption | Stimulus | NonDurable |
|---|---|---|---|---|---|---|
| A | year1 | 1000 | 100 | 300 | 100 | 200 |
| A | year2 | 2000 | 200 | 600 | 200 | 400 |
| A | year3 | 4000 | 400 | 800 | 400 | 800 |
| B | year1 | 1000 | 100 | 300 | 100 | 200 |
| B | year2 | 2000 | 200 | 600 | 200 | 400 |
| B | year3 | 4000 | 400 | 800 | 400 | 800 |
The pattern continues for the remaining names. So, for every person, the table should be long by year, with the values in Types becoming new columns that hold the corresponding numbers.
One approach I have thought of: filter by each type into separate data frames, apply pivot_longer to each, then bind_cols the results. But I think this is inefficient.
Here is my code:
library(dplyr); library(tidyr)
df %>% filter(Types == "Income") %>%
  pivot_longer(cols = year1:year3, names_to = "Year", values_to = "income") %>% select(-Types) %>%
  bind_cols(df %>% filter(Types == "Tax Paid") %>%
    pivot_longer(cols = year1:year3, names_to = "Year", values_to = "Tax_paid") %>% select(Tax_paid)) %>%
  bind_cols(df %>% filter(Types == "Consumption") %>%
    pivot_longer(cols = year1:year3, names_to = "Year", values_to = "Consumption") %>% select(Consumption)) %>%
  bind_cols(df %>% filter(Types == "Stimulus") %>%
    pivot_longer(cols = year1:year3, names_to = "Year", values_to = "Stimulus") %>% select(Stimulus)) %>%
  bind_cols(df %>% filter(Types == "NonDurable expenses") %>%
    pivot_longer(cols = year1:year3, names_to = "Year", values_to = "NonDurable_expenses") %>% select(NonDurable_expenses))
Can someone suggest better, more efficient code?
I think this does what you want:
library(tidyr); library(dplyr)
(df
|> pivot_longer(year1:year3, names_to = "year")
|> pivot_wider(names_from = Types)
|> rename_with(~ stringr::str_remove(., " +.*"))
)
(The extra parentheses around the whole expression are an odd formatting/stylistic preference of mine; doing it this way allows each line to start with a pipe, rather than having to make sure they end with a pipe ...)
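In case it helps to see why two pivots are enough, here is a rough sketch that runs the same idea one step at a time on the sample data from the question. The intermediate names `long` and `wide` are just my own labels for illustration, and the sketch relies on the tidyr defaults: `pivot_longer()` puts the values in a column called `value`, and `pivot_wider()` reads from that column when `values_from` is not given.

```r
library(tidyr)
library(dplyr)

# Sample data, rebuilt from the question
Names <- c(rep("A", 5), rep("B", 5), rep("C", 5))
Types <- rep(c("Income", "Tax Paid", "Consumption", "Stimulus", "NonDurable expenses"), 3)
df <- data.frame(
  Names, Types,
  year1 = rep(c(1000, 100, 300, 100, 200), 3),
  year2 = rep(c(2000, 200, 600, 200, 400), 3),
  year3 = rep(c(4000, 400, 800, 400, 800), 3)
)

# Step 1: stack the year columns, giving one row per Names/Types/year;
# the numbers go into the default "value" column
long <- pivot_longer(df, year1:year3, names_to = "year")

# Step 2: spread Types back out into columns, giving one row per Names/year
# (values_from defaults to "value")
wide <- pivot_wider(long, names_from = Types)

dim(long)    # 45 rows, 4 columns
dim(wide)    # 9 rows, 7 columns
names(wide)  # column names still contain spaces at this point
```

The `rename_with()` step in the answer is only cosmetic: it cuts each column name at its first space, so "Tax Paid" becomes "Tax" and "NonDurable expenses" becomes "NonDurable". If your real data has two types that share a first word, that regex would produce duplicate names, so an explicit `rename()` might be safer there.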