I have the following data frame:
library(tidyverse)
dat <- structure(list(fasta_header = c(">seq1", ">seq2"), sequence = c("MPSRGTRPE",
"VSSKYTFWNF")), .Names = c("fasta_header", "sequence"), row.names = c(NA,
-2L), class = c("tbl_df", "tbl", "data.frame"))
dat
#> # A tibble: 2 x 2
#> fasta_header sequence
#> <chr> <chr>
#> 1 >seq1 MPSRGTRPE
#> 2 >seq2 VSSKYTFWNF
What I want to do is calculate the frequency of each amino acid for every row. The desired result (worked out by hand) is:
fasta_header sequence M P S R G T E V K Y F W N
>seq1 MPSRGTRPE 1 2 1 2 1 1 1 0 0 0 0 0 0
>seq2 VSSKYTFWNF 0 0 2 0 0 1 0 1 1 1 2 1 1
How can I do this with a dplyr pipeline?
The comments above are right, but if you really want a tidyverse
pipeline...
library(tidyverse) # uses dplyr, purrr, tidyr and stringr
dat %>%
  mutate(split = map(sequence, ~ unlist(str_split(., "")))) %>% # split into characters
  unnest(split) %>% # one row per character (name the column; bare unnest() is deprecated)
  group_by(fasta_header, sequence) %>% # group by sequence
  count(split) %>% # count letters within each group
  spread(key = split, value = n, fill = 0) # convert to wide format
fasta_header sequence E F G K M N P R S T V W Y
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 >seq1 MPSRGTRPE 1. 0. 1. 0. 1. 0. 2. 2. 1. 1. 0. 0. 0.
2 >seq2 VSSKYTFWNF 0. 2. 0. 1. 0. 1. 0. 0. 2. 1. 1. 1. 1.
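For what it's worth, here is a rough sketch of the same idea using the newer tidyr verb `pivot_wider()` in place of `spread()`. It assumes tidyr 1.0 or later; the names `residue` and `counts` are just placeholders I made up, and the data frame is rebuilt with `tibble()` so the snippet runs on its own.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Same two sequences as in the question, rebuilt so the sketch is self-contained
dat <- tibble(
  fasta_header = c(">seq1", ">seq2"),
  sequence     = c("MPSRGTRPE", "VSSKYTFWNF")
)

counts <- dat %>%
  mutate(residue = str_split(sequence, "")) %>%   # list-column of single characters
  unnest(residue) %>%                             # one row per residue
  count(fasta_header, sequence, residue) %>%      # tally residues within each sequence
  pivot_wider(
    names_from  = residue,
    values_from = n,
    values_fill = list(n = 0L)                    # residues absent from a sequence become 0
  )

counts
```

The residue columns may come out in a different order than with `spread()`, so select or reorder them by name if the order matters.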
Here you go
library(tidyverse) # attaches dplyr and stringr, both used below
dat <- structure(list(fasta_header = c(">seq1", ">seq2"), sequence = c("MPSRGTRPE",
"VSSKYTFWNF")), .Names = c("fasta_header", "sequence"), row.names = c(NA,
-2L), class = c("tbl_df", "tbl", "data.frame"))
# Vector of unique amino acids across all sequences
uniqueaa <- dat$sequence %>% strsplit(split = "") %>%
  unlist() %>% unique() %>% data.frame(stringsAsFactors = FALSE)
colnames(uniqueaa) <- "uniqueaa"
# Count occurrences of each amino acid in every sequence
result <- apply(uniqueaa, 1, function(x) str_count(dat$sequence, x["uniqueaa"]))
colnames(result) <- uniqueaa$uniqueaa
rownames(result) <- dat$sequence
result
M P S R G T E V K Y F W N
MPSRGTRPE 1 2 1 2 1 1 1 0 0 0 0 0 0
VSSKYTFWNF 0 0 2 0 0 1 0 1 1 1 2 1 1
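One caveat with the approach above: `str_count()` treats its pattern as a regular expression, so a symbol such as `*` (sometimes used for a stop codon in protein FASTA files) would not be counted literally. If that is a concern, a base R sketch along the following lines avoids pattern matching altogether by tabulating the split characters against a fixed set of levels. The object names (`sequences`, `residues`, `alphabet`, `counts`) are made up for illustration.

```r
# Named character vector: names are the FASTA headers from the question
sequences <- c(">seq1" = "MPSRGTRPE", ">seq2" = "VSSKYTFWNF")

# Split every sequence into single characters (names are preserved)
residues <- strsplit(sequences, "")

# All residues seen in any sequence, in order of first appearance
alphabet <- unique(unlist(residues))

# Tabulate each sequence against the shared alphabet so absent residues count as 0;
# sapply() returns residues in rows, so transpose to get one row per sequence
counts <- t(sapply(residues, function(x) table(factor(x, levels = alphabet))))

counts
```

The result is an integer matrix with one row per header and one column per residue; `as.data.frame(counts)` would turn it back into a data frame if needed.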