strsplit one column with exact information into two columns

Tags:

split

r

I have data looking like this:

    SNP Geno Allele
marker1   G1    AA
marker2   G1    TT
marker3   G1    TT
marker1   G2    CC
marker2   G2    AA
marker3   G2    TT
marker1   G3    GG
marker2   G3    AA
marker3   G3    TT

And I want it to look like this:

    SNP Geno Allele1 Allele2
marker1   G1       A       A
marker2   G1       T       T
marker3   G1       T       T
marker1   G2       C       C
marker2   G2       A       A
marker3   G2       T       T
marker1   G3       G       G
marker2   G3       A       A
marker3   G3       T       T

I am using this:

strsplit(Allele, split extended = TRUE)

But this is not working. Do I need additional commands?

asked May 02 '12 21:05

marie

1 Answer

Another approach, from start to finish:

Make reproducible data:

dat <- read.table(header = TRUE,  text = "SNP Geno    Allele
marker1 G1  AA
marker2 G1  TT
marker3 G1  TT
marker1 G2  CC
marker2 G2  AA
marker3 G2  TT
marker1 G3  GG
marker2 G3  AA
marker3 G3  TT")

UPDATED Extract the Allele column, split it into individual characters, then make those characters into two columns of a data frame:

EITHER

dat1 <- data.frame(t(matrix(
                     unlist(strsplit(as.vector(dat$Allele), split = "")), 
                     ncol = length(dat$Allele), nrow = 2)))

OR following @joran's suggestion

dat1 <- data.frame(do.call(rbind, strsplit(as.vector(dat$Allele), split = "")))
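To see why both variants work, it may help to look at what strsplit() returns. The following is a minimal sketch on a toy vector (the name `alleles` and its values are made up for illustration), assuming every genotype string has exactly two characters:

```r
# Toy input (hypothetical values, mirroring the Allele column)
alleles <- c("AA", "TT", "CC")

# strsplit() returns a list with one character vector per input string
parts <- strsplit(alleles, split = "")
str(parts)

# do.call(rbind, ...) stacks those vectors as the rows of a character matrix
m <- do.call(rbind, parts)
dim(m)
```

If some strings were longer or shorter than two characters, rbind() would recycle the shorter vectors with a warning, so it is worth checking the lengths first, for example with table(nchar(alleles)).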

THEN

Add column names to the new columns:

names(dat1) <- c("Allele1", "Allele2")

Attach the two new columns to columns from the original data table, as @user1317221 suggests:

dat3 <- cbind(dat$SNP, dat$Geno, dat1)
dat3
  dat$SNP dat$Geno Allele1 Allele2
1 marker1       G1       A       A
2 marker2       G1       T       T
3 marker3       G1       T       T
4 marker1       G2       C       C
5 marker2       G2       A       A
6 marker3       G2       T       T
7 marker1       G3       G       G
8 marker2       G3       A       A
9 marker3       G3       T       T
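The `dat$SNP` and `dat$Geno` headers in that output appear because the vectors are passed to cbind() without names. As a possible alternative (not the approach used above), here is a sketch that names every column explicitly and uses substr() instead of strsplit(). It assumes each genotype is exactly two characters long and uses a shortened copy of the data:

```r
# Shortened copy of the example data
dat <- read.table(header = TRUE, text = "SNP Geno Allele
marker1 G1 AA
marker2 G1 TT
marker1 G2 CC")

# as.character() guards against Allele being read as a factor (R < 4.0)
alleles <- as.character(dat$Allele)

# substr() extracts by position, so no list handling is needed
dat3 <- data.frame(
  SNP     = dat$SNP,
  Geno    = dat$Geno,
  Allele1 = substr(alleles, 1, 1),
  Allele2 = substr(alleles, 2, 2)
)
names(dat3)
```

Naming the arguments in data.frame() gives the SNP, Geno, Allele1, Allele2 headers requested in the question without a separate names() step.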

answered Nov 14 '22 21:11

Ben
