I have the following dummy data:
library(dplyr) library(tidyr) library(reshape2) dt <- expand.grid(Year = 1990:2014, Product=LETTERS[1:8], Country = paste0(LETTERS, "I")) %>% select(Product, Country, Year) dt$value <- rnorm(nrow(dt))
I pick two product-country combinations
sdt <- dt %>% filter((Product == "A" & Country == "AI") | (Product == "B" & Country =="EI"))
and I want to see the values side by side for each combination. I can do this with dcast
:
sdt %>% dcast(Year ~ Product + Country)
Is it possible to do this with spread
from the package tidyr?
To use spread() , pass it the name of a data frame, then the name of the key column in the data frame, and then the name of the value column. Pass the column names as they are; do not use quotes. To tidy table2 , you would pass spread() the key column and then the value column.
The function spread() does the reverse of gather(). It takes two columns (key and value) and spreads into multiple columns.
One option would be to create a new 'Prod_Count' by joining the 'Product' and 'Country' columns by paste
, remove those columns with the select
and reshape from 'long' to 'wide' using spread
from tidyr
.
library(dplyr) library(tidyr) sdt %>% mutate(Prod_Count=paste(Product, Country, sep="_")) %>% select(-Product, -Country)%>% spread(Prod_Count, value)%>% head(2) # Year A_AI B_EI #1 1990 0.7878674 0.2486044 #2 1991 0.2343285 -1.1694878
Or we can avoid a couple of steps by using unite
from tidyr
(from @beetroot's comment) and reshape as before.
sdt%>% unite(Prod_Count, Product,Country) %>% spread(Prod_Count, value)%>% head(2) # Year A_AI B_EI # 1 1990 0.7878674 0.2486044 # 2 1991 0.2343285 -1.1694878
With the new function pivot_wider()
introduced in tidyr version 1.0.0, this can be accomplished with one function call.
pivot_wider()
(counterpart: pivot_longer()
) works similar to spread()
. However, it offers additional functionality such as using multiple key/name columns (and/or multiple value columns). To this end, the argument names_from
—that indicates from which column(s) the names of the new variables are taken—may take more than one column name (here Product
and Country
).
library("tidyr") sdt %>% pivot_wider(id_cols = Year, names_from = c(Product, Country)) %>% head(2) #> # A tibble: 2 x 3 #> Year A_AI B_EI #> <int> <dbl> <dbl> #> 1 1990 -2.08 -0.113 #> 2 1991 -1.02 -0.0546
See also: https://tidyr.tidyverse.org/articles/pivot.html
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With