Hi, my head has been boiling for 3 days now! I want to get all DNA encodings for a peptide. A peptide is a sequence of amino acids, e.g. amino acid M and amino acid Q can form the peptide MQ or QM.
DNA encoding means there is a DNA code (called a codon) for each amino acid. For some amino acids there is more than one codon, e.g. amino acid T has 4 different codons.
The last function in the following code is not working, so I would like someone to make it work for me, and please no LINQ (language-integrated query):
private string[] CODONS ={
"TTT", "TTC", "TTA", "TTG", "TCT",
"TCC", "TCA", "TCG", "TAT", "TAC", "TGT", "TGC", "TGG", "CTT",
"CTC", "CTA", "CTG", "CCT", "CCC", "CCA", "CCG", "CAT", "CAC",
"CAA", "CAG", "CGT", "CGC", "CGA", "CGG", "ATT", "ATC", "ATA",
"ATG", "ACT", "ACC", "ACA", "ACG", "AAT", "AAC", "AAA", "AAG",
"AGT", "AGC", "AGA", "AGG", "GTT", "GTC", "GTA", "GTG", "GCT",
"GCC", "GCA", "GCG", "GAT", "GAC", "GAA", "GAG", "GGT", "GGC",
"GGA", "GGG", };
private string[] AMINOS_PER_CODON = {
"F", "F", "L", "L", "S", "S",
"S", "S", "Y", "Y", "C", "C", "W", "L", "L", "L", "L", "P", "P",
"P", "P", "H", "H", "Q", "Q", "R", "R", "R", "R", "I", "I", "I",
"M", "T", "T", "T", "T", "N", "N", "K", "K", "S", "S", "R", "R",
"V", "V", "V", "V", "A", "A", "A", "A", "D", "D", "E", "E", "G",
"G", "G", "G", };
public string codonToAminoAcid(String codon)
{
for (int k = 0; k < CODONS.Length; k++)
{
if (CODONS[k].Equals(codon))
{
return AMINOS_PER_CODON[k];
}
}
// never reach here with valid codon
return "X";
}
public string AminoAcidToCodon(String aminoAcid)
{
for (int k = 0; k < AMINOS_PER_CODON.Length; k++)
{
if (AMINOS_PER_CODON[k].Equals(aminoAcid))
{
return CODONS[k];
}
}
// never reach here with valid amino acid
return "X";
}
public string GetCodonsforPeptide(string pep)
{
string result = "";
for (int i = 0; i < pep.Length; i++)
{
result = AminoAcidToCodon(pep.Substring(i, 1));
for (int q = 0; q < pep.Length; q++)
{
result += AminoAcidToCodon(pep.Substring(q, 1));
}
}
return result;
}
Try using the following two methods:
public IEnumerable<string> AminoAcidToCodon(char aminoAcid)
{
for (int k = 0; k < AMINOS_PER_CODON.Length; k++)
{
if (AMINOS_PER_CODON[k] == aminoAcid)
{
yield return CODONS[k];
}
}
}
public IEnumerable<string> GetCodonsforPeptide(string pep)
{
if (string.IsNullOrEmpty(pep))
{
yield return string.Empty;
yield break;
}
foreach (var codon in AminoAcidToCodon(pep[0]))
foreach (var codonOfRest in GetCodonsforPeptide(pep.Substring(1)))
yield return codon + codonOfRest;
}
Notes:
The first method uses yield return to return each matching codon, not just the first one.
I changed the AMINOS_PER_CODON array to use char as a type instead of string. You can easily change the code to use your string array if you want.
Example output when passing in "MA":
ATGGCT
ATGGCC
ATGGCA
ATGGCG
This is because M maps to:
ATG
and A maps to:
GCT
GCC
GCA
GCG
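To make the combination step concrete: every codon of the first amino acid is paired with every encoding of the rest, so the number of results is the product of the per-amino-acid codon counts (1 x 4 = 4 for "MA"). Here is a minimal sketch of that count, without LINQ. The class name EncodingCounter and the AMINO_PER_CODON string (the question's AMINOS_PER_CODON array joined into one string) are made up for illustration and are not part of the original code:

```csharp
public static class EncodingCounter
{
    // One character per codon, in the same order as the question's
    // AMINOS_PER_CODON array (hypothetical compact form of that array).
    private const string AMINO_PER_CODON =
        "FFLLSSSSYYCCWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

    // Number of codons that encode the given amino acid (0 if it is unknown).
    public static int CodonCount(char aminoAcid)
    {
        int count = 0;
        foreach (char c in AMINO_PER_CODON)
        {
            if (c == aminoAcid)
            {
                count++;
            }
        }
        return count;
    }

    // Number of DNA encodings of a peptide:
    // the product of the codon counts of its amino acids.
    public static long CountEncodings(string pep)
    {
        long total = 1;
        foreach (char aminoAcid in pep)
        {
            total *= CodonCount(aminoAcid);
        }
        return total;
    }
}
```

EncodingCounter.CountEncodings("MA") gives 4, matching the four lines of output above. The count grows quickly with peptide length, which is a good reason to keep GetCodonsforPeptide lazy (IEnumerable with yield return) rather than building one big list.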
The dictionary I suggest you use would look like this:
var codonsByAminoAcid = new Dictionary<char, string[]>
{
{ 'M', new[] { "ATG" } },
{ 'A', new[] { "GCT", "GCC", "GCA", "GCG" } }
};
This would replace the AminoAcidToCodon method.
You can even build that dictionary from your two arrays:
var lookup =
CODONS
.Zip(AMINOS_PER_CODON, (codon, amino) => new { codon, amino })
.GroupBy(entry => entry.amino)
.ToDictionary(
g => g.Key,
g => g.Select(ge => ge.codon).ToArray());
The GetCodonsforPeptide method could then look like this:
public IEnumerable<string> GetCodonsforPeptide(string pep)
{
if (string.IsNullOrEmpty(pep))
{
yield return string.Empty;
yield break;
}
foreach (var codon in lookup[pep[0]])
foreach (var codonOfRest in GetCodonsforPeptide(pep.Substring(1)))
yield return codon + codonOfRest;
}
I.e., replace the call to the other method with an access to the lookup table.
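Since the question asked for no LINQ, here is a rough self-contained sketch that puts the pieces together and builds the lookup table with a plain loop instead of Zip/GroupBy. It assumes the two arrays stay string[] as in the question and takes the first character of each amino acid entry as the dictionary key. The class name PeptideEncoder and the helper BuildLookup are invented for this example:

```csharp
using System.Collections.Generic;

public static class PeptideEncoder
{
    private static readonly string[] CODONS = {
        "TTT", "TTC", "TTA", "TTG", "TCT", "TCC", "TCA", "TCG", "TAT", "TAC",
        "TGT", "TGC", "TGG", "CTT", "CTC", "CTA", "CTG", "CCT", "CCC", "CCA",
        "CCG", "CAT", "CAC", "CAA", "CAG", "CGT", "CGC", "CGA", "CGG", "ATT",
        "ATC", "ATA", "ATG", "ACT", "ACC", "ACA", "ACG", "AAT", "AAC", "AAA",
        "AAG", "AGT", "AGC", "AGA", "AGG", "GTT", "GTC", "GTA", "GTG", "GCT",
        "GCC", "GCA", "GCG", "GAT", "GAC", "GAA", "GAG", "GGT", "GGC", "GGA",
        "GGG" };

    private static readonly string[] AMINOS_PER_CODON = {
        "F", "F", "L", "L", "S", "S", "S", "S", "Y", "Y",
        "C", "C", "W", "L", "L", "L", "L", "P", "P", "P",
        "P", "H", "H", "Q", "Q", "R", "R", "R", "R", "I",
        "I", "I", "M", "T", "T", "T", "T", "N", "N", "K",
        "K", "S", "S", "R", "R", "V", "V", "V", "V", "A",
        "A", "A", "A", "D", "D", "E", "E", "G", "G", "G",
        "G" };

    // Amino acid -> all codons that encode it, built once from the parallel arrays.
    private static readonly Dictionary<char, List<string>> Lookup = BuildLookup();

    private static Dictionary<char, List<string>> BuildLookup()
    {
        var lookup = new Dictionary<char, List<string>>();
        for (int k = 0; k < CODONS.Length; k++)
        {
            char amino = AMINOS_PER_CODON[k][0];
            List<string> codons;
            if (!lookup.TryGetValue(amino, out codons))
            {
                codons = new List<string>();
                lookup[amino] = codons;
            }
            codons.Add(CODONS[k]);
        }
        return lookup;
    }

    // Lazily yields every DNA encoding of the peptide.
    // Throws KeyNotFoundException for a character that is not in the table.
    public static IEnumerable<string> GetCodonsforPeptide(string pep)
    {
        if (string.IsNullOrEmpty(pep))
        {
            yield return string.Empty;
            yield break;
        }
        foreach (var codon in Lookup[pep[0]])
            foreach (var codonOfRest in GetCodonsforPeptide(pep.Substring(1)))
                yield return codon + codonOfRest;
    }
}
```

Calling PeptideEncoder.GetCodonsforPeptide("MA") in a foreach loop yields ATGGCT, ATGGCC, ATGGCA and ATGGCG, in that order. How to treat unknown amino acid letters (throw, skip, or return "X" as the question's code does) is a design choice this sketch leaves open by simply letting the dictionary throw.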