
Flatten list column in data frame with ID column

My data frame contains the output of a survey with a select-multiple question, so some cells in a list column hold more than one value.

df <- data.frame(a=1:3,b=I(list(1,1:2,1:3)))
df
  a       b
1 1       1
2 2    1, 2
3 3 1, 2, 3

I would like to flatten out the list to obtain the following output:

df
  a       b
1 1       1
2 2       1
3 2       2
4 3       1
5 3       2
6 3       3

This should be easy, but somehow I can't find the right search terms. Thanks.

asked May 14 '15 at 17:05 by mloudon




2 Answers

You can just use unnest from "tidyr":

library(tidyr)
unnest(df, b)
#   a b
# 1 1 1
# 2 2 1
# 3 2 2
# 4 3 1
# 5 3 2
# 6 3 3
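
For intuition about what unnest does here, the same reshaping can be sketched in base R without any packages: repeat each value of a once per element of its b entry, then unlist b. This is not part of the answer above, only an illustrative equivalent, and the name flat is hypothetical.

```r
df <- data.frame(a = 1:3, b = I(list(1, 1:2, 1:3)))

# lengths() gives the number of values held in each cell of the list column,
# so rep() repeats every ID once per value; unlist() then flattens the values.
flat <- data.frame(
  a = rep(df$a, lengths(df$b)),
  b = unlist(df$b)
)
flat
#   a b
# 1 1 1
# 2 2 1
# 3 2 2
# 4 3 1
# 5 3 2
# 6 3 3
```

If the data frame has more ID columns, the same rep() index can be used to repeat whole rows, for example df[rep(seq_len(nrow(df)), lengths(df$b)), ].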
answered Oct 02 '22 at 21:10 by A5C1D2H2I1M1N2O1R2T1


Using base R, one option is to name the list elements of column 'b' after the corresponding values of 'a' and then call stack on the named list. setNames sets those names in a single step.

stack(setNames(df$b, df$a))
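
Note that stack names its output columns values and ind, with ind stored as a factor, so the result does not yet have the a/b layout shown in the question. A minimal sketch of the tidy-up step follows; the names long and flat are only illustrative.

```r
df <- data.frame(a = 1:3, b = I(list(1, 1:2, 1:3)))

# Name each list element after its ID, then stack into long form.
long <- stack(setNames(df$b, df$a))

# `ind` is a factor of the former names; go through character to recover
# the original integer IDs, and put the columns back in a/b order.
flat <- data.frame(
  a = as.integer(as.character(long$ind)),
  b = long$values
)
flat
```

Converting through as.character matters: calling as.integer directly on a factor returns the internal level codes, which only coincide with the IDs when the IDs happen to be 1, 2, 3, and so on.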

Another option is to let unstack name the list elements of 'b' after the values of 'a' automatically, and then call stack on the result to get a data.frame.

stack(unstack(df, b~a))

Or we can use the convenience function listCol_l from splitstackshape, which flattens the list column into long form.

library(splitstackshape)
listCol_l(df, 'b')
answered Oct 02 '22 at 21:10 by akrun