
r - merge and melt list to data frame

Tags:

r

Situation & data

I have a data frame of employees df_employees:

df_employees <- structure(list(empNo = c(1001, 1002, 1003)), .Names = "empNo", row.names = c(NA, 
  -3L), class = "data.frame")

> df_employees
  empNo
1  1001
2  1002
3  1003

and a list of skills, l_skills:

l_skills <- list(c("skill1", "skill2", "skill3"), c("skill1", "skill2"), 
             "skill1")


> l_skills
[[1]]
[1] "skill1" "skill2" "skill3"

[[2]]
[1] "skill1" "skill2"

[[3]]
[1] "skill1"

Question

How do I merge and melt the data to get the following data frame, df_result?

> df_result
  empNo skills
1  1001 skill1
2  1001 skill2
3  1001 skill3
4  1002 skill1
5  1002 skill2
6  1003 skill1

Attempts

I thought I could use an approach similar to the cSplit function, but I get an error when trying to install cSplit:

> install.packages("cSplit")
Installing package into ‘C:/Users/<username>/Documents/R/win-library/3.1’
(as ‘lib’ is unspecified)
Warning in install.packages :
  package ‘cSplit’ is not available (for R version 3.1.2)
tospig asked Jan 08 '23 22:01


2 Answers

You can make use of the melt function from the reshape2 package:

library(reshape2)

L <- l_skills
names(L) <- df_employees$empNo

result <- melt(L)
colnames(result) <- c('skills','empNo')

result
#   skills empNo
# 1 skill1  1001
# 2 skill2  1001
# 3 skill3  1001
# 4 skill1  1002
# 5 skill2  1002
# 6 skill1  1003

Base R solution:

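# Map() always returns a list, so this also works when every
# employee has the same number of skills (mapply() would simplify then)
do.call(rbind, Map(cbind, df_employees$empNo, l_skills))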
#       [,1]   [,2]    
#[1,] "1001" "skill1"
#[2,] "1001" "skill2"
#[3,] "1001" "skill3"
#[4,] "1002" "skill1"
#[5,] "1002" "skill2"
#[6,] "1003" "skill1"
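
The cbind/rbind call above yields a character matrix, so empNo is no longer numeric. A minimal base R sketch that builds the data frame directly is shown here. It assumes l_skills has one element per row of df_employees, in the same order; the helper name n_skills is introduced only for illustration:

```r
# Data from the question
df_employees <- data.frame(empNo = c(1001, 1002, 1003))
l_skills <- list(c("skill1", "skill2", "skill3"), c("skill1", "skill2"), "skill1")

# Number of skills per employee; vapply() also works on older R
# versions that do not provide lengths()
n_skills <- vapply(l_skills, length, integer(1))

df_result <- data.frame(
  empNo  = rep(df_employees$empNo, times = n_skills),  # repeat each empNo once per skill
  skills = unlist(l_skills),                           # flatten the list in order
  stringsAsFactors = FALSE
)
```

Because empNo is taken straight from df_employees, it keeps its numeric type.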
Marat Talipov answered Jan 11 '23 17:01


You could also use stack from base R:

 setNames(stack(setNames(L, df_employees$empNo)), c('skills', 'empNo'))
 #   skills empNo
 #1 skill1  1001
 #2 skill2  1001
 #3 skill3  1001
 #4 skill1  1002
 #5 skill2  1002
 #6 skill1  1003
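
One thing to watch with stack: it stores the list names in a factor, so empNo comes back as a factor rather than a number. If a numeric column is needed, a small sketch of the conversion (the variable name res is only for illustration) could look like this:

```r
# Data from the question
df_employees <- data.frame(empNo = c(1001, 1002, 1003))
l_skills <- list(c("skill1", "skill2", "skill3"), c("skill1", "skill2"), "skill1")

res <- setNames(stack(setNames(l_skills, df_employees$empNo)),
                c("skills", "empNo"))

# Convert via character first; as.numeric() on a factor alone would
# return the internal level codes (1, 2, 3) instead of the labels
res$empNo <- as.numeric(as.character(res$empNo))
```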

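Or using splitstackshape (cSplit is a function in this package, so it is splitstackshape rather than cSplit that needs to be installed):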

 library(splitstackshape)
 listCol_l(transform(df_employees, skills=I(L)), 'skills')[]
 #   empNo skills_ul
 #1:  1001    skill1
 #2:  1001    skill2
 #3:  1001    skill3
 #4:  1002    skill1
 #5:  1002    skill2
 #6:  1003    skill1

where

 L <- l_skills
akrun answered Jan 11 '23 19:01