
How to replace several characters in a string using Julia

I'm essentially trying to solve this problem: http://rosalind.info/problems/revc/

I want to replace all occurrences of A, C, G, T with their complements T, G, C, A. In other words, all A's will be replaced with T's, all C's with G's, and so on.

I had previously used the replace() function to replace all occurrences of 'T' with 'U', and was hoping it would also take a list of characters to replace with another list of characters. I haven't been able to make that work, so it might not have that functionality.

I know I could solve this easily using the BioJulia package and have done so using the following:

# creating complementary strand of DNA
# reverse the string
# find the complementary nucleotide
using Bio.Seq
s = dna"AAAACCCGGT"

t = reverse(complement(s))

println("$t")

But I'd like to not have to rely on the package.

Here's the code I have so far. If someone could steer me in the right direction, that'd be great.

# creating complementary strand of DNA
# reverse the string
# find the complementary nucleotide

s = open("nt.txt") # open file containing sequence

t = reverse(s) # reverse the sequence

final = replace(t, r'[ACGT]', '[TGCA]') # this is probably incorrect
# replace characters ACGT with TGCA

println("$final")
asked Mar 03 '16 20:03 by System


1 Answer

It seems that replace doesn't yet do translations quite like, say, the tr command in Bash (Julia 1.7 and later do accept several pattern => replacement pairs in a single replace call). So here are a couple of approaches using a dictionary mapping instead (the BioJulia package also appears to make similar use of dictionaries):

compliments = Dict('A' => 'T', 'C' => 'G', 'G' => 'C', 'T' => 'A')

Then if str = "AAAACCCGGT", you could use join like this:

julia> join([compliments[c] for c in str])
"TTTTGGGCCA"

Another approach could be to use a function and map:

function translate(c)
    compliments[c]
end

Then:

julia> map(translate, str)
"TTTTGGGCCA"

Strings are iterable objects in Julia; each of these approaches reads one character at a time, c, and looks it up in the dictionary to get back the complementary character. A new string is built up from these complementary characters.
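Since the original problem also asks for the strand to be reversed, the mapping can be combined with reverse. The following is only a minimal sketch: the names COMPLEMENT and reverse_complement are made up for illustration, and it assumes the input contains nothing but the characters A, C, G and T (so a trailing newline read from a file would need to be stripped first).

```julia
# Lookup table for DNA complements (same mapping as the dictionary above).
const COMPLEMENT = Dict('A' => 'T', 'C' => 'G', 'G' => 'C', 'T' => 'A')

# Hypothetical helper: complement every base, then reverse the result.
# Throws a KeyError if seq contains anything other than A, C, G or T.
function reverse_complement(seq::AbstractString)
    return reverse(map(c -> COMPLEMENT[c], seq))
end

println(reverse_complement("AAAACCCGGT"))  # prints ACCGGGTTTT
```

Complementing first and reversing afterwards gives the same result as the other order, since each character is translated independently of its position.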

Julia's strings are also immutable: you can't swap characters around in place; instead, you need to build a new string.
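To illustrate the immutability point, here is a small sketch. The variable names are arbitrary; it shows that indexed assignment on a String fails, and that one way around it is to copy the characters into a mutable vector and build a new string from that.

```julia
str = "AAAACCCGGT"

# String has no setindex! method, so indexed assignment raises a MethodError.
failed = try
    str[1] = 'T'
    false
catch err
    err isa MethodError
end

# To change a character, copy into a mutable Vector{Char}, edit it,
# and build a new string from the result.
chars = collect(str)
chars[1] = 'T'
new_str = String(chars)

println(failed)   # true
println(new_str)  # TAAACCCGGT
println(str)      # AAAACCCGGT (the original is unchanged)
```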

answered Sep 28 '22 18:09 by Alex Riley