
How can you check for specific characters in a string?

When I run the program it always prints True. For example, if I enter AAJJ it prints True, because it is only checking whether the first letter is valid. Can someone point me in the right direction? Thanks!

squence_str = raw_input("Enter either A DNA, Protein or RNA sequence:")

def DnaCheck():

    for i in (squence_str):
        if string.upper(i) =="A":
            return True
        elif string.upper(i) == "T":
            return True
        elif string.upper(i) == "C":
            return True
        elif string.upper(i) == "G":
            return True
        else:
            return False

print "DNA ", DnaCheck()
Chief C asked Dec 23 '22 14:12


2 Answers

You need to check that all of the bases in the DNA sequence are valid.

def DnaCheck(sequence):
    dna = set('ACTG')
    return all(base.upper() in dna for base in sequence)

all(...) consumes a generator expression that iterates over the nucleotides in the given sequence, converts each one to upper case, and checks whether it is in the DNA set {'A', 'C', 'T', 'G'}. As soon as one value is not in this set, the function returns False without processing the remaining characters in the sequence. Otherwise it returns True once every character has been processed and found in the set.

For example, the sequence "axctgACTGACT" would return False after only processing the first two characters in the sequence, as "x" converted to the uppercase "X" is not in the DNA set {'A','C', 'T', 'G'} and thus the remaining characters in the sequence don't need to be checked.
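To see the short-circuiting described above, here is a minimal sketch that counts how many characters get examined before all(...) stops. It assumes Python 3 (the question uses Python 2), and the helper name dna_check_counting is made up for illustration; it is not part of the answer's function.

```python
def dna_check_counting(sequence):
    """Return (is_valid, characters_examined) for a candidate DNA string.

    Hypothetical helper: same check as DnaCheck above, plus a counter
    so the early exit of all(...) becomes visible.
    """
    dna = set("ACTG")
    examined = 0

    def checks():
        nonlocal examined
        for base in sequence:
            examined += 1
            # Yield True for a valid base, False otherwise.
            yield base.upper() in dna

    # all(...) stops pulling from the generator at the first False.
    return all(checks()), examined


print(dna_check_counting("axctgACTGACT"))  # (False, 2)
print(dna_check_counting("acgtACGT"))      # (True, 8)
```

One edge case worth knowing: all(...) over an empty iterable is True, so an empty string passes this check. If that is not wanted, test for an empty sequence separately.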

Alexander answered Jan 22 '23 16:01


I like @Alexander's answer, but for variety you could see if

def dna_check(sequence):
    return set(sequence.upper()).issubset("ACGT")
    # another possibility:
    # return set(sequence).issubset("ACGTacgt")

might be faster on long sequences, especially if the odds of being a legal sequence are good (i.e., most of the time you will have to iterate over the whole sequence anyway).
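Whether the set-based version is actually faster depends on the input and the Python build, so it is worth measuring. The following is a rough sketch, assuming Python 3, that first confirms both approaches agree and then times them with timeit. The function names, the sequence length, and the repeat count are arbitrary choices for illustration.

```python
import timeit


def dna_check_all(sequence):
    # Generator-based check: stops at the first invalid character.
    dna = set("ACTG")
    return all(base.upper() in dna for base in sequence)


def dna_check_subset(sequence):
    # Set-based check: always reads the whole sequence.
    return set(sequence.upper()).issubset("ACGT")


valid = "ACGT" * 50_000          # legal sequence, must be scanned fully
invalid_early = "X" + valid      # illegal character at the very start

for label, seq in [("valid", valid), ("invalid early", invalid_early)]:
    # Both implementations should give the same answer.
    assert dna_check_all(seq) == dna_check_subset(seq)
    t_all = timeit.timeit(lambda: dna_check_all(seq), number=3)
    t_subset = timeit.timeit(lambda: dna_check_subset(seq), number=3)
    print(f"{label:>14}: all() {t_all:.4f}s  issubset() {t_subset:.4f}s")
```

The trade-off to look for in the numbers: the all(...) version can exit early on bad input, while the set version does its work in built-in code but always processes the entire string.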

Hugh Bothwell answered Jan 22 '23 15:01