How to convert a list of vectors of different lengths to a usable data frame in R?

I have a (fairly long) list of vectors. The vectors consist of Russian words that I got by using the strsplit() function on sentences.

The following is what head() returns:

[[1]]
[1] "модно"     "создавать" "резюме"    "в"         "виде"     

[[2]]
[1] "ты"        "начианешь" "работать"  "с"         "этими"    

[[3]]
[1] "модно"            "называть"         "блогер-рилейшенз" "―"                "начинается"       "задолго"         

[[4]]
[1] "видел" "по"    "сыну," "что"   "он"   

[[5]]
[1] "четырнадцать," "я"             "поселился"     "на"            "улице"        

[[6]]
[1] "широко"     "продолжали" "род."

Note that the vectors have different lengths.

What I want is to be able to read the first word of each sentence, then the second word, the third, and so on.

The desired result would be something like this:

    P1              P2           P3                 P4    P5           P6
[1] "модно"         "создавать"  "резюме"           "в"   "виде"       NA
[2] "ты"            "начианешь"  "работать"         "с"   "этими"      NA
[3] "модно"         "называть"   "блогер-рилейшенз" "―"   "начинается" "задолго"         
[4] "видел"         "по"         "сыну,"            "что" "он"         NA
[5] "четырнадцать," "я"          "поселился"        "на"  "улице"      NA
[6] "широко"        "продолжали" "род."             NA    NA           NA

I tried using data.frame() directly, but that failed because the vectors have different lengths. I also tried rbind.fill() from the plyr package, but that function expects data frames (or matrices, via rbind.fill.matrix()), not a list of plain vectors.

I found some other questions here (that's where I got the plyr help from), but those were all about combining for instance two data frames of different size.

Thanks for your help.

Ico asked Mar 04 '13 12:03


3 Answers

One liner with plyr

plyr::ldply(word.list, rbind) 
Ramnath answered Oct 08 '22 10:10


Try this:

word.list <- list(letters[1:4], letters[1:5], letters[1:2], letters[1:6])
n.obs <- sapply(word.list, length)
seq.max <- seq_len(max(n.obs))
mat <- t(sapply(word.list, "[", i = seq.max))

The trick is that indexing past the end of a vector fills the missing positions with NA, so

c(1:2)[1:4] 

returns the original vector followed by two NAs.
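To make the trick concrete, here is a minimal base R sketch that pads each vector by over-indexing and then converts the result to a data frame. The column names P1..Pn follow the layout asked for in the question; the names `max.len`, `padded`, and `df` are illustrative only, and `vapply()` is swapped in for `sapply()` so that the result is always a matrix.

```r
# Same sample data as in the answer above
word.list <- list(letters[1:4], letters[1:5], letters[1:2], letters[1:6])

# Indexing past the end of a vector yields NA for the missing positions
padded <- c("a", "b")[1:4]

# Pad every vector to the longest length; t() gives one row per list element
max.len <- max(lengths(word.list))
mat <- t(vapply(word.list, function(x) x[seq_len(max.len)], character(max.len)))

# Convert to a data frame with columns P1..Pn (naming assumed from the question)
df <- as.data.frame(mat, stringsAsFactors = FALSE)
names(df) <- paste0("P", seq_len(max.len))
```

With this, `df$P1` holds the first word of every sentence, `df$P2` the second, and so on, with NA wherever a sentence is shorter than the longest one.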

adibender answered Oct 08 '22 10:10


Another option is stri_list2matrix() from the stringi package:

library(stringi)
stri_list2matrix(l, byrow=TRUE)
#    [,1] [,2] [,3] [,4]
#[1,] "a"  "b"  "c"  NA  
#[2,] "a2" "b2" NA   NA  
#[3,] "a3" "b3" "c3" "d3"

NOTE: Data from @juba's post.

Or as @Valentin mentioned in the comments

t(sapply(l, "length<-", max(lengths(l))))
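As a rough illustration of how the `length<-` approach behaves, the sketch below uses base R only. The list `l` is not defined in this answer, so the data here is an assumption reconstructed from the printed output above; `by.col` and `by.row` are illustrative names.

```r
# Assumed sample data, reconstructed from the output shown in this answer
l <- list(c("a", "b", "c"), c("a2", "b2"), c("a3", "b3", "c3", "d3"))

# `length<-` extends each vector to the target length, filling with NA
target <- max(lengths(l))

# sapply() places each padded vector in its own column ...
by.col <- sapply(l, `length<-`, target)

# ... so transposing gives one row per list element
by.row <- t(by.col)
```

Here `by.row` has the same row-per-element layout as the `stri_list2matrix(l, byrow=TRUE)` result, while `by.col` is its transpose.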

Or using tidyverse

library(purrr)
library(tidyr)
library(dplyr)
tibble(V = l) %>% 
   unnest_wider(V, names_sep = "")
# A tibble: 3 × 4
#   V1    V2    V3    V4   
#   <chr> <chr> <chr> <chr>
# 1 a     b     c     <NA> 
# 2 a2    b2    <NA>  <NA> 
# 3 a3    b3    c3    d3   
akrun answered Oct 08 '22 10:10