I have relatively tidy data with samples, genes, alleles, and frequencies in separate columns. For each gene and for each sample I need to split out the alleles and their corresponding frequency into separate columns. Here's what I have and what I need.
Trying to do this with dplyr/tidyr, but I'll take any solution I can get.
What I have:
data.frame(sample=rep("sample1", 10),
gene=rep(paste0("gene", 1:5), each=2),
allele=c("A", "G", "A", "C", "A", "T", "C", "G", "G", "T"),
freq=c(.9, .1, .8, .2, .7, .3, .6, .4, .5, .5))
# sample gene allele freq
# 1 sample1 gene1 A 0.9
# 2 sample1 gene1 G 0.1
# 3 sample1 gene2 A 0.8
# 4 sample1 gene2 C 0.2
# 5 sample1 gene3 A 0.7
# 6 sample1 gene3 T 0.3
# 7 sample1 gene4 C 0.6
# 8 sample1 gene4 G 0.4
# 9 sample1 gene5 G 0.5
# 10 sample1 gene5 T 0.5
What I want:
data.frame(sample=rep("sample1", 5),
gene=paste0("gene", 1:5),
allele1=c("A", "A", "A", "C", "G"),
allele2=c("G", "C", "T", "G", "T"),
freq1=c(.9, .8, .7, .6, .5),
freq2=c(.1, .2, .3, .4, .5))
# sample gene allele1 allele2 freq1 freq2
# 1 sample1 gene1 A G 0.9 0.1
# 2 sample1 gene2 A C 0.8 0.2
# 3 sample1 gene3 A T 0.7 0.3
# 4 sample1 gene4 C G 0.6 0.4
# 5 sample1 gene5 G T 0.5 0.5
You can use dcast from the devel version of data.tableie. 1.9.5+ which can take multiple value.var columns. We create a sequence column ('indx') grouped by 'sample' and 'gene'. Then dcast from long to wide format mentioning the value.var columns.
library(data.table)#v1.9.5+
setDT(df)[, indx:=1:.N,.(sample, gene)]
dcast(df, sample+gene~indx, value.var=c('allele', 'freq'), sep= '')
# sample gene allele1 allele2 freq1 freq2
#1: sample1 gene1 A G 0.9 0.1
#2: sample1 gene2 A C 0.8 0.2
#3: sample1 gene3 A T 0.7 0.3
#4: sample1 gene4 C G 0.6 0.4
#5: sample1 gene5 G T 0.5 0.5
NOTE: Instructions to install the devel version are here
The sep='' argument is useful to create the column names as 'allele1', 'allele2' etc. The default would be `allele_1', 'allele_2' etc. (from @Arun's comments)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With