Context: I'm a CS n00b working my way through "Cracking the Coding Interview." The first problem asks to "implement an algorithm to determine if a string has all unique characters." My (likely naive) implementation is as follows:
def isUniqueChars2(string):
uchars = []
for c in string:
if c in uchars:
return False
else:
uchars.append(c)
return True
The author suggests the following implementation:
def isUniqueChars(string):
checker = 0
for c in string:
val = ord(c) - ord('a')
if (checker & (1 << val) > 0):
return False
else:
checker |= (1 << val)
return True
What makes the author's implementation better than mine (FWIW, the author's solution was in Java and I converted it to Python -- is my solution one that is not possible to implement in Java)? Or, more generally, what is desirable in a solution to this problem? What is wrong with the approach I've taken? I'm assuming there are some fundamental CS concepts (that I'm not familiar with) that are important and help inform the choice of which approach to take to this problem.
Check the values of the characters that are next to each other. If the value does not match for all the pairs of characters in a string, then the string has all unique characters.
Python program to check whether the string contains all unique characters. So, in the program, the function 'check_unique' checks the uniqueness of characters in the string. If the string contains all unique characters, this function returns true.
Use the substring() method in C# to check each and every substring for unique characters. Loop it until the length of the string. If any one the substring matches another, then it would mean that the string do not have unique characters.
To find the duplicate character from the string, we count the occurrence of each character in the string. If count is greater than 1, it implies that a character has a duplicate entry in the string. In above example, the characters highlighted in green are duplicate characters.
Here is how I would write this:
def unique(s):
return len(set(s)) == len(s)
Strings are iterable so you can pass your argument directly to set()
to get a set of the characters from the string (which by definition will not contain any duplicates). If the length of that set is the same as the length of the original string then you have entirely unique characters.
Your current approach is fine and in my opinion it is much more Pythonic and readable than the version proposed by the author, but you should change uchars
to be a set instead of a list. Sets have O(1) membership test so c in uchars
will be considerably faster on average if uchars
is a set rather than a list. So your code could be written as follows:
def unique(s):
uchars = set()
for c in s:
if c in uchars:
return False
uchars.add(c)
return True
This will actually be more efficient than my version if the string is large and there are duplicates early, because it will short-circuit (exit as soon as the first duplicate is found).
Your approach is perfectly fine. This is python, when there are a bajillion ways to do something. (Yours is more beautiful too :)). But if you really want it to be more pythonic and/or make it go faster, you could use a set, as F.J's answer has described.
The second solution just looks really hard to follow and understand.
(PS, dict
is a built-in type. Don't override it :p. And string
is a module from the standard library.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With