Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Checking if two strings are permutations of each other in Python

Tags:

I'm checking if two strings a and b are permutations of each other, and I'm wondering what the ideal way to do this is in Python. From the Zen of Python, "There should be one -- and preferably only one -- obvious way to do it," but I see there are at least two ways:

sorted(a) == sorted(b) 

and

all(a.count(char) == b.count(char) for char in a) 

but the first one is slower when (for example) the first char of a is nowhere in b, and the second is slower when they are actually permutations.

Is there any better (either in the sense of more Pythonic, or in the sense of faster on average) way to do it? Or should I just choose from these two depending on which situation I expect to be most common?

like image 946
Kiv Avatar asked Dec 28 '08 17:12

Kiv


People also ask

Is permutation for string in Python?

For example, the string ABC has 6 permutations, i.e., ABC, ACB, BAC, BCA, CBA, CAB . In Python, we can use the built-in module itertools to get permutations of elements in the list using the permutations() function. However, we can also write your utility function to generate all permutations of a string.

How do you count permutations of a string?

We can find the count without finding all permutation. Idea is to find all the characters that is getting repeated, i.e., frequency of all the character. Then, we divide the factorial of the length of string by multiplication of factorial of frequency of characters.


2 Answers

Here is a way which is O(n), asymptotically better than the two ways you suggest.

import collections  def same_permutation(a, b):     d = collections.defaultdict(int)     for x in a:         d[x] += 1     for x in b:         d[x] -= 1     return not any(d.itervalues())  ## same_permutation([1,2,3],[2,3,1]) #. True  ## same_permutation([1,2,3],[2,3,1,1]) #. False 
like image 124
namin Avatar answered Sep 23 '22 07:09

namin


"but the first one is slower when (for example) the first char of a is nowhere in b".

This kind of degenerate-case performance analysis is not a good idea. It's a rat-hole of lost time thinking up all kinds of obscure special cases.

Only do the O-style "overall" analysis.

Overall, the sorts are O( n log( n ) ).

The a.count(char) for char in a solution is O( n 2 ). Each count pass is a full examination of the string.

If some obscure special case happens to be faster -- or slower, that's possibly interesting. But it only matters when you know the frequency of your obscure special cases. When analyzing sort algorithms, it's important to note that a fair number of sorts involve data that's already in the proper order (either by luck or by a clever design), so sort performance on pre-sorted data matters.

In your obscure special case ("the first char of a is nowhere in b") is this frequent enough to matter? If it's just a special case you thought of, set it aside. If it's a fact about your data, then consider it.

like image 26
S.Lott Avatar answered Sep 22 '22 07:09

S.Lott