
Counting differences between two strings

Tags: python

I'm trying to count the number of differences between two imported strings (seq1 and seq2, import code not listed), but am getting no result when running the program. I want the output to read something like "2 differences." Not sure where I'm going wrong...

def difference (seq1, seq2):    
    count = 0
    for i in seq1:
        if seq1[i] != seq2[i]:
            count += 1
        return (count)
    print (count, "differences")
Ryan Scott asked Feb 10 '15 03:02



3 Answers

You could do this concisely with a generator expression:

count = sum(1 for a, b in zip(seq1, seq2) if a != b)

If the sequences have different lengths, you may want to count the difference in length as a difference in content (I would). In that case, add a term to account for it:

count = sum(1 for a, b in zip(seq1, seq2) if a != b) + abs(len(seq1) - len(seq2))

Another, slightly unusual way to write this takes advantage of True being 1 and False being 0:

sum(a != b for a, b in zip(seq1, seq2)) + abs(len(seq1) - len(seq2))

zip is a Python builtin that lets you iterate over two sequences at once. It stops at the end of the shortest sequence. Observe:

>>> seq1 = 'hi'
>>> seq2 = 'world'
>>> for a, b in zip(seq1, seq2):
...     print('a =', a, '| b =', b)
... 
a = h | b = w
a = i | b = o

The generator expression in the first snippet evaluates similarly to sum([1, 1, 1]), where each 1 represents a difference between the two sequences. The if a != b filter causes the generator to produce a value only when a and b differ.
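The pieces above can be wrapped into a single function. This is only a sketch: the name count_differences and the sample strings are made up for illustration, and it assumes that each extra trailing item in the longer sequence should count as one difference.

```python
def count_differences(seq1, seq2):
    """Count positions where seq1 and seq2 differ.

    Assumption: every extra trailing item in the longer sequence
    counts as one difference.
    """
    # zip stops at the shorter sequence, so this only compares the overlap.
    mismatches = sum(1 for a, b in zip(seq1, seq2) if a != b)
    # Add the length gap so unmatched trailing items are counted too.
    return mismatches + abs(len(seq1) - len(seq2))


# Illustrative inputs, not taken from the question.
print(count_differences("GATTACA", "GACTATA"), "differences")
print(count_differences("hi", "world"), "differences")
```

If trailing items should not count, dropping the abs(...) term leaves only the positions that zip actually compares.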

Ryan Haining answered Oct 22 '22 16:10


When you say for i in seq1, you are iterating over the characters, not the indexes, so seq1[i] tries to index a string with a character and raises a TypeError. You can use enumerate instead by writing for i, ch in enumerate(seq1).

Or even better, use the standard function zip to go through both sequences at once.

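You also have a problem because you return before you print: the return sits inside the loop, so the function exits on the first pass through the loop, before the print can run. Move the return down and unindent it so it runs after the loop has finished.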
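Putting those points together, one possible shape for the fixed function is sketched below. The name difference_enumerate and the sample strings are hypothetical, and the sketch assumes seq2 is at least as long as seq1.

```python
def difference_enumerate(seq1, seq2):
    # Assumption: seq2 is at least as long as seq1; otherwise seq2[i]
    # raises IndexError.
    count = 0
    # enumerate yields (index, character) pairs, so i is an integer.
    for i, ch in enumerate(seq1):
        if ch != seq2[i]:
            count += 1
    # The return is outside the loop, so every position is checked first.
    return count


# Illustrative inputs, not taken from the question.
print(difference_enumerate("GATTACA", "GACTATA"), "differences")
```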

John Zwinck answered Oct 22 '22 18:10


There are three issues in your script:

  1. "i" should be an integer index, not a character
  2. "return" should be at the same level as "print" inside the function, not inside the "for" loop
  3. try not to use "print" this way inside functions; return the value and let the caller print it

Here is a working version:

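def difference(seq1, seq2):
    # Assumes seq2 is at least as long as seq1; otherwise seq2[i] raises IndexError.
    count = 0
    for i in range(len(seq1)):  # i is an integer index, not a character
        if seq1[i] != seq2[i]:
            count += 1
    return count  # runs once, after the loop has finished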
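The index-based loop raises an IndexError when seq2 is shorter than seq1. A hedged variant is sketched below: the name difference_safe and the sample strings are hypothetical, and it assumes that items beyond the shorter sequence should simply be ignored. The caller does the printing, in line with point 3.

```python
def difference_safe(seq1, seq2):
    # Assumption: only the overlapping positions are compared; items past
    # the end of the shorter sequence are ignored.
    shared = min(len(seq1), len(seq2))
    count = 0
    for i in range(shared):
        if seq1[i] != seq2[i]:
            count += 1
    return count


# The function only returns the number; the caller chooses the output format.
result = difference_safe("GATTACA", "GACT")
print(result, "differences")
```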
breezin answered Oct 22 '22 18:10