
How can I do the following comparison without having to write 20 if-statements or making 20 lists/dictionaries?

This problem is related to biology, so for those who know what amino acids and codons are, that's great! For those who don't, I have tried my best to phrase it so that you can understand what I am talking about.

I have a list of codons: 3-letter strings built from the four letters A, G, C, and T (e.g. AAT, GAT, GCT). Each codon corresponds to one amino acid, but several codons can correspond to the same amino acid. The codon table at http://www.cbs.dtu.dk/courses/27619/codon.html illustrates this.

For each codon on my list, I want to find out which amino acid it corresponds to. The program therefore has to look each codon up among the 64 possible codons in the linked table and report the matching amino acid. However, I can't figure out a shortcut that avoids either making a separate list of codons for each amino acid and comparing against it, or writing 20 different if statements.

The list of codons I have is called mutated_codon. So I need to write a for loop that takes each codon in mutated_codon, looks it up in the dictionary, and outputs the corresponding amino acid letter. What code do I have to write to do that? I'm not familiar with the syntax used to look up values in a dictionary.

Here's what I've got so far, based on suggestions:

codon_lookup = {
    'GCT': 'A', 'GCC': 'A', 'GCA': 'A', 'GCG': 'A',
    'TGT': 'C', 'TGC': 'C',
    'GAT': 'D', 'GAC': 'D',
    'GAA': 'E', 'GAG': 'E',
    'TTT': 'F', 'TTC': 'F',
    'GGT': 'G', 'GGC': 'G', 'GGA': 'G', 'GGG': 'G',
    'CAT': 'H', 'CAC': 'H',
    'ATT': 'I', 'ATC': 'I', 'ATA': 'I',
    'AAA': 'K', 'AAG': 'K',
    'TTA': 'L', 'TTG': 'L', 'CTT': 'L', 'CTC': 'L', 'CTA': 'L', 'CTG': 'L',
    'ATG': 'M',
    'AAT': 'N', 'AAC': 'N',
    'CCT': 'P', 'CCC': 'P', 'CCA': 'P', 'CCG': 'P',
    'CAA': 'Q', 'CAG': 'Q',
    'CGT': 'R', 'CGC': 'R', 'CGA': 'R', 'CGG': 'R', 'AGA': 'R', 'AGG': 'R',
    'TCT': 'S', 'TCC': 'S', 'TCA': 'S', 'TCG': 'S', 'AGT': 'S', 'AGC': 'S',
    'ACT': 'T', 'ACC': 'T', 'ACA': 'T', 'ACG': 'T',
    'GTT': 'V', 'GTC': 'V', 'GTA': 'V', 'GTG': 'V',
    'TGG': 'W',
    'TAT': 'Y', 'TAC': 'Y',
    'TAA': 'Z', 'TAG': 'Z', 'TGA': 'Z',
}

for c in mutated_codon:
    print(codon_lookup[c])

However, my output only shows the amino acid for the last codon on the list, and on top of that I get KeyError: 4. Any ideas what could be wrong?

asked Nov 29 '22 00:11 by bioprogrammer


1 Answer

You can set a dictionary up like this:

codon_lookup = {
    'ATT':'Isoleucine',
    'ATC':'Isoleucine', 
    'ATA':'Isoleucine',
    'CTT':'Leucine',
    'CTC':'Leucine', 
    'CTA':'Leucine',
     # ... etc
} 

then you can make queries like

codon_lookup['ATT']

Which will give you

'Isoleucine'
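
Since you mentioned not knowing the syntax for checking a dictionary, here is a minimal sketch (Python 3) of the two usual ways to test for a key before relying on it. The three-entry table is only an excerpt, and 'XYZ' and the 'unknown' fallback are made-up values for illustration:

```python
# A small excerpt of the table; the full one has 64 entries.
codon_lookup = {
    'ATT': 'Isoleucine',
    'ATC': 'Isoleucine',
    'CTT': 'Leucine',
}

# `in` tests whether a key is present in the dictionary.
has_att = 'ATT' in codon_lookup
has_xyz = 'XYZ' in codon_lookup   # 'XYZ' is a made-up, invalid codon

# .get() returns a fallback value instead of raising KeyError.
known = codon_lookup.get('CTT', 'unknown')
missing = codon_lookup.get('XYZ', 'unknown')

print(has_att, has_xyz, known, missing)
```

Plain indexing (codon_lookup['XYZ']) would raise KeyError here, so .get() is the gentler choice when the input may contain something unexpected.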

EDIT:

You can set a dictionary up like this:

codon_lookup = {
    'ATT':'I',
    'ATC':'I', 
    'ATA':'I',
    'CTT':'L',
    'CTC':'L', 
    'CTA':'L',
     # ... etc
} 

then you can make queries like

codon_lookup['ATT']

Which will give you

'I'
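
If typing 64 key/value pairs by hand feels error-prone, one option is to write the table grouped by amino acid and invert it with a dict comprehension. This is only a sketch for Python 3: the codon assignments are copied from the question (including its use of 'Z' for the three stop codons), and codons_by_amino_acid is a name made up here.

```python
# Codons grouped by one-letter amino acid code, as listed in the question.
# 'Z' marks the stop codons, following the question's convention.
codons_by_amino_acid = {
    'A': 'GCT GCC GCA GCG',
    'C': 'TGT TGC',
    'D': 'GAT GAC',
    'E': 'GAA GAG',
    'F': 'TTT TTC',
    'G': 'GGT GGC GGA GGG',
    'H': 'CAT CAC',
    'I': 'ATT ATC ATA',
    'K': 'AAA AAG',
    'L': 'TTA TTG CTT CTC CTA CTG',
    'M': 'ATG',
    'N': 'AAT AAC',
    'P': 'CCT CCC CCA CCG',
    'Q': 'CAA CAG',
    'R': 'CGT CGC CGA CGG AGA AGG',
    'S': 'TCT TCC TCA TCG AGT AGC',
    'T': 'ACT ACC ACA ACG',
    'V': 'GTT GTC GTA GTG',
    'W': 'TGG',
    'Y': 'TAT TAC',
    'Z': 'TAA TAG TGA',
}

# Invert the grouping so that each codon maps to its amino acid.
codon_lookup = {
    codon: amino_acid
    for amino_acid, codons in codons_by_amino_acid.items()
    for codon in codons.split()
}

print(len(codon_lookup))
print(codon_lookup['ATT'])
```

The result is the same flat codon-to-letter dictionary as above, so lookups such as codon_lookup['ATT'] work unchanged.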

If you want to check your list of mutated codons against this dictionary, you can loop through it. If your mutated_codons list looks like ['ACA','GTT',...] then:

for mutated_codon in mutated_codons:
    print(codon_lookup[mutated_codon])
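
About the KeyError: 4 from the question: that message means the key being looked up was the integer 4, so at least one element of your list is probably a number rather than a 3-letter string. Getting output only for the last codon may also mean the print line was not indented under the for, though that is a guess without seeing the exact code. A sketch (Python 3) of a loop that tolerates bad entries and collects the results; the input list and the '?' placeholder are hypothetical:

```python
codon_lookup = {'ACA': 'T', 'GTT': 'V', 'ATG': 'M'}  # excerpt only

# Hypothetical input: the stray integer 4 would trigger "KeyError: 4"
# with plain codon_lookup[...] indexing.
mutated_codons = ['ACA', 'GTT', 4, 'ATG']

amino_acids = []
for mutated_codon in mutated_codons:
    # The lookup must be indented so that it runs once per codon.
    # .get() falls back to '?' for anything that is not a known codon.
    amino_acids.append(codon_lookup.get(mutated_codon, '?'))

protein = ''.join(amino_acids)
print(protein)
```

Printing the list itself (print(mutated_codons)) before the loop is a quick way to see which entry is not a string.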

answered Dec 10 '22 08:12 by Farmer Joe