
Generating random sequences of DNA

I am trying to generate random DNA sequences in Python using random numbers and random strings, but I only get a single character as output. For example, for a DNA of length 5, String(5) should return something like "CTGAT", and String(4) something like "CTGT". Instead I get "G" or "C" or "T" or "A", i.e. only one character each time. Could anyone please help me with this?

I tried the following code:

from random import choice
def String(length):

   DNA=""
   for count in range(length):
      DNA+=choice("CGTA")
      return DNA
Rachel asked Dec 04 '22 07:12


2 Answers

I'd generate the string all in one go, rather than build it up. Unless Python's being clever and optimising the string additions, it'll reduce the runtime complexity from quadratic to linear.

import random

def DNA(length):
    # Python 3: range replaces xrange, and print is a function
    return ''.join(random.choice('CGTA') for _ in range(length))

print(DNA(5))
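If you're on Python 3.6 or later, random.choices can draw all the bases in a single call, which keeps the "build it in one go" idea. This is just a minimal sketch: the names random_dna and alphabet are made up for illustration and are not part of the question's code.

```python
import random

def random_dna(length, alphabet="CGTA"):
    # random.choices samples `length` items with replacement (Python 3.6+)
    return "".join(random.choices(alphabet, k=length))

print(random_dna(5))
```

The output is random, so each run prints a different 5-character string.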
Paul Hankin answered Dec 08 '22 05:12


You return too quickly:

from random import choice
def String(length):

   DNA=""
   for count in range(length):
      DNA+=choice("CGTA")
      return DNA

If your return statement is inside the for loop, the loop body runs only once: the return exits the function on the first iteration, so you get back a single character.

From the Python Documentation on return statements: "return leaves the current function call with the expression list (or None) as return value."

So, put the return at the end of your function:

from random import choice

def String(length):
    DNA = ""
    for count in range(length):
        DNA += choice("CGTA")
    # return only after the loop has finished
    return DNA

EDIT: Here's a weighted choice method (it will only work for strings currently, since it uses string repetition).

def weightedchoice(items): # this doesn't require the numbers to add up to 100
    return choice("".join(x * y for x, y in items))

Then, you want to call weightedchoice instead of choice in your loop:

DNA+=weightedchoice([("C", 10), ("G", 20), ("A", 40), ("T", 30)])
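As an alternative to the string-repetition trick, random.choices (Python 3.6+) accepts a weights argument directly, so it also works for non-string items and non-integer weights. A rough sketch, assuming the weights are passed as a dict; weighted_dna and its parameters are hypothetical names, not something from the question:

```python
import random

def weighted_dna(length, weights):
    # weights: dict mapping each base to a relative weight
    # (the weights do not need to add up to 100)
    bases = list(weights)
    picks = random.choices(bases, weights=[weights[b] for b in bases], k=length)
    return "".join(picks)

print(weighted_dna(10, {"C": 10, "G": 20, "A": 40, "T": 30}))
```

With these weights, "A" should show up roughly 40% of the time over a long sequence.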

Rushy Panchal answered Dec 08 '22 05:12