Create a dataframe from a dataframe

Question

I'd like to create a data frame from one that I created earlier. My first data frame is:

    Sample motif chromosome
    1      CT-G.A    1
    1      TA-C.C    1
    1      TC-G.C    2
    2      CG-A.T    2
    2      CA-G.T    2

Then I want to create a data frame like the one below, with one column for every motif-chromosome combination (96 motifs * 24 chromosomes):

    Sample CT-G.A,chr1 TA-C.C,chr1 TC-G.C,chr1 CG-A.T,chr1 CA-G.T,chr1 CT-G.A,chr2 TA-C.C,chr2 TC-G.C,chr2 CG-A.T,chr2 CA-G.T,chr2
    1                1           1           0           0           0           0           0           1           0           0
    2                0           0           0           0           0           0           0           0           1           1

Florian · Accepted Answer

Here is a possible solution using dplyr and tidyr.

We add a column value that marks each observed motif-chromosome pair with a 1, then complete the data.frame so that there is a row for every motif-chromosome-Sample combination, where missing combinations get a 0 in the value column. We create a key out of the motif and chromosome columns and sum value per Sample and key, which also discards the original motif and chromosome columns (a pair that occurs twice for a Sample ends up with a 2). Lastly, we reshape the data.frame from long to wide with spread() to get your desired format. Hope this helps!

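# Example data. Note that the last row (Sample 2, CA-G.T, chromosome 2)
# appears twice, so it is counted as 2 in the output.
df <- read.table(text = "Sample motif chromosome
1      CT-G.A    1
1      TA-C.C    1
1      TC-G.C    2
2      CG-A.T    2
2      CA-G.T    2
2      CA-G.T    2", header = TRUE)

library(tidyr)
library(dplyr)

df %>%
  mutate(value = 1) %>%                                            # mark observed rows
  complete(motif, chromosome, Sample, fill = list(value = 0)) %>%  # add missing combinations as 0
  mutate(key = paste0(motif, ',chr', chromosome)) %>%              # e.g. "CT-G.A,chr1"
  group_by(Sample, key) %>%
  summarize(value = sum(value)) %>%                                # count per Sample and key
  spread(key, value) %>%                                           # long to wide
  as.data.frame()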

Output:

  Sample CA-G.T,chr1 CA-G.T,chr2 CG-A.T,chr1 CG-A.T,chr2 CT-G.A,chr1 CT-G.A,chr2 TA-C.C,chr1 TA-C.C,chr2 TC-G.C,chr1 TC-G.C,chr2
1      1           0           0           0           0           1           0           1           0           0           1
2      2           0           2           0           1           0           0           0           0           0           0
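
The output above holds counts (the duplicated row shows up as a 2), whereas the question asks for 0/1 values and for columns covering every motif-chromosome combination, including ones that never occur in the data. The following is only a sketch of one way to get there, not part of the original answer. It assumes the full motif and chromosome sets are known in advance; `all_motifs` and `all_chroms` are hypothetical names, and chromosome 3 is added here purely to show an unobserved level. It relies on `complete()` expanding factors over all of their levels, and it uses `pivot_wider()` (tidyr 1.0.0 or later) in place of `spread()`.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  Sample     = c(1, 1, 1, 2, 2, 2),
  motif      = c("CT-G.A", "TA-C.C", "TC-G.C", "CG-A.T", "CA-G.T", "CA-G.T"),
  chromosome = c(1, 1, 2, 2, 2, 2)
)

# Assumption: the complete sets are known up front. In the real data these
# would be the 96 motifs and 24 chromosomes; here they are placeholders.
all_motifs <- sort(unique(df$motif))
all_chroms <- c(1, 2, 3)  # chromosome 3 never occurs in df

wide <- df %>%
  mutate(
    motif      = factor(motif, levels = all_motifs),
    chromosome = factor(chromosome, levels = all_chroms)
  ) %>%
  distinct(Sample, motif, chromosome) %>%  # repeated rows count once
  mutate(value = 1L) %>%
  # For factor columns, complete() uses every level, not just observed values
  complete(Sample, motif, chromosome, fill = list(value = 0L)) %>%
  mutate(key = paste0(motif, ",chr", chromosome)) %>%
  select(Sample, key, value) %>%
  pivot_wider(names_from = key, values_from = value) %>%
  as.data.frame()

print(wide)
```

If counts are what you want after all, dropping the `distinct()` step and summing `value` per Sample and key, as in the answer, should give them.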

Tags: dataframe, r, reshape, create-table, bioinformatics

user3585775
