Using pivot_longer with multiple paired columns in the wide dataset

Tags: r, tidyr, tidyverse

I have a dataset that looks like this:

input <- 
  data.frame(
    event = 1:2,
    url_1 = c("g1", "g3"),
    name_1 = c("dc", "nyc"),
    url_2 = c("g2", "g4"),
    name_2 = c("sf", "la"))

Essentially there are pairs of indexed columns that are stuck together in wide form. I want to convert to long to give this output:

output <- 
  data.frame(
    event = c(1,1,2,2),
    url = c("g1", "g2", "g3", "g4"),
    name = c("dc", "sf", "nyc", "la"))

I want to do this using pivot_longer. I've tried this:

input %>% 
  pivot_longer(contains("_"))

How can I get the function to recognize the column-pairs?

lethalSinger asked May 21 '20 18:05


1 Answer

You want to use .value in the names_to argument:

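library(dplyr)  # provides %>% and select()
library(tidyr)  # provides pivot_longer()

input %>%
  pivot_longer(
    -event,                          # pivot every column except event
    names_to = c(".value", "item"),  # prefix names the output column; suffix goes to item
    names_sep = "_"                  # split url_1 into "url" and "1"
  ) %>% 
  select(-item)                      # drop the pair index once it is no longer needed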

# A tibble: 4 x 3
  event url   name 
  <int> <fct> <fct>
1     1 g1    dc   
2     1 g2    sf   
3     2 g3    nyc  
4     2 g4    la   
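
To see what .value is doing, it may help to look at the result before item is dropped. The following is a minimal sketch that reuses the question's input; the object name long is only illustrative, and stringsAsFactors = FALSE is added so the columns are plain character vectors regardless of R version.

```r
library(tidyr)

# Same data as in the question; stringsAsFactors = FALSE is an added assumption
input <- data.frame(
  event  = 1:2,
  url_1  = c("g1", "g3"),
  name_1 = c("dc", "nyc"),
  url_2  = c("g2", "g4"),
  name_2 = c("sf", "la"),
  stringsAsFactors = FALSE
)

# Each column name is split at "_": the first piece (".value") selects the
# output column (url or name), the second piece is stored in item
long <- pivot_longer(
  input,
  cols      = -event,
  names_to  = c(".value", "item"),
  names_sep = "_"
)

print(long)
```

The item column holds the suffix of each original pair ("1" or "2"), so it can be kept when the position of the pair matters, or dropped as in the answer above.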

From this article on pivoting:

Note the special name .value: this tells pivot_longer() that that part of the column name specifies the “value” being measured (which will become a variable in the output).
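
When column names do not split cleanly on one separator, names_pattern can mark "that part of the column name" with a regular-expression capture group instead. The sketch below assumes every paired column ends in an underscore followed by digits; the regex and the object name by_pattern are illustrative, not part of the original answer.

```r
library(tidyr)

# Same data as in the question; stringsAsFactors = FALSE is an added assumption
input <- data.frame(
  event  = 1:2,
  url_1  = c("g1", "g3"),
  name_1 = c("dc", "nyc"),
  url_2  = c("g2", "g4"),
  name_2 = c("sf", "la"),
  stringsAsFactors = FALSE
)

# First capture group -> .value (the output column name),
# second capture group -> item (the numeric suffix, kept as text)
by_pattern <- pivot_longer(
  input,
  cols          = -event,
  names_to      = c(".value", "item"),
  names_pattern = "(.+)_(\\d+)"
)

# Drop the pair index with base R, so dplyr is not required
by_pattern$item <- NULL

print(by_pattern)
```

For this input the pattern and names_sep = "_" should give the same result; the pattern form is mainly useful when the prefix itself may contain underscores.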

JasonAizkalns answered Oct 18 '22 07:10