Implementing an algorithm to determine if a string has all unique characters [closed]

Tags:

python

Context: I'm a CS n00b working my way through "Cracking the Coding Interview." The first problem asks to "implement an algorithm to determine if a string has all unique characters." My (likely naive) implementation is as follows:

def isUniqueChars2(string):
  uchars = []
  for c in string:
    if c in uchars:
      return False
    else:
      uchars.append(c)
  return True

The author suggests the following implementation:

def isUniqueChars(string):
  checker = 0
  for c in string:
    val = ord(c) - ord('a')
    if (checker & (1 << val) > 0):
      return False
    else:
      checker |= (1 << val)
  return True

What makes the author's implementation better than mine (FWIW, the author's solution was in Java and I converted it to Python -- is my solution one that is not possible to implement in Java)? Or, more generally, what is desirable in a solution to this problem? What is wrong with the approach I've taken? I'm assuming there are some fundamental CS concepts (that I'm not familiar with) that are important and help inform the choice of which approach to take to this problem.

asked Jun 28 '13 04:06

sparsity

2 Answers

Here is how I would write this:

def unique(s):
    return len(set(s)) == len(s)

Strings are iterable so you can pass your argument directly to set() to get a set of the characters from the string (which by definition will not contain any duplicates). If the length of that set is the same as the length of the original string then you have entirely unique characters.
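As a quick illustration of the length comparison (the sample strings here are made up for the example):

```python
def unique(s):
    # A set drops duplicates, so the lengths match only when
    # every character in s is distinct.
    return len(set(s)) == len(s)

# "hello" has a repeated "l", so its set is one element shorter.
print(len(set("hello")), len("hello"))  # 4 5
print(unique("hello"))  # False
print(unique("world"))  # True
```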

Your current approach is fine, and in my opinion it is much more Pythonic and readable than the version proposed by the author, but you should make uchars a set instead of a list. A membership test on a set is O(1) on average, while on a list it is O(n), so c in uchars will be considerably faster with a set, and the loop as a whole drops from O(n²) in the worst case to O(n) on average. So your code could be written as follows:

def unique(s):
    uchars = set()
    for c in s:
        if c in uchars:
            return False
        uchars.add(c)
    return True

This will actually be more efficient than my version if the string is large and there are duplicates early, because it will short-circuit (exit as soon as the first duplicate is found).
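A rough way to see both effects for yourself is sketched below. The function names unique_list and unique_set and the 2000-character test string are my own choices for the example, and the timings will vary by machine, so no numbers are shown.

```python
import timeit

def unique_list(s):
    seen = []
    for c in s:
        if c in seen:       # linear scan of the list on every character
            return False
        seen.append(c)
    return True

def unique_set(s):
    seen = set()
    for c in s:
        if c in seen:       # hash lookup, O(1) on average
            return False
        seen.add(c)
    return True

# Worst case for both: 2000 distinct characters, so no early exit.
distinct = "".join(chr(i) for i in range(2000))
print("list:", timeit.timeit(lambda: unique_list(distinct), number=10))
print("set: ", timeit.timeit(lambda: unique_set(distinct), number=10))

# Early duplicate: the loop returns after looking at two characters,
# whereas len(set(s)) == len(s) would still process the whole string.
print(unique_set("aa" + distinct))
```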

answered Oct 22 '22 03:10

Andrew Clark

Beautiful is better than ugly.

Your approach is perfectly fine. This is Python, where there are a bajillion ways to do something. (Yours is more beautiful too :)). But if you really want it to be more Pythonic and/or make it go faster, you could use a set, as F.J's answer has described.

The second solution just looks really hard to follow and understand.
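For what it's worth, here is my reading of what it is doing, as a commented sketch. It assumes the string contains only the lowercase letters 'a' to 'z'. That is an assumption on my part (the question doesn't state it), but ord(c) - ord('a') only works as a bit position under it, so the sketch raises an error for anything else. The name is_unique_bitmask is made up for the example.

```python
def is_unique_bitmask(s):
    # checker is used as a set of 26 flags:
    # bit i is 1 once the i-th letter of the alphabet has been seen.
    checker = 0
    for c in s:
        val = ord(c) - ord('a')   # 'a' -> 0, 'b' -> 1, ..., 'z' -> 25
        if not 0 <= val <= 25:
            # Assumption of this sketch: lowercase a-z input only.
            raise ValueError("expected only lowercase a-z, got %r" % c)
        mask = 1 << val           # an integer with only bit `val` set
        if checker & mask:        # non-zero means this letter was seen before
            return False
        checker |= mask           # mark this letter as seen
    return True

print(is_unique_bitmask("abc"))   # True
print(is_unique_bitmask("abca"))  # False
```

So it is the same idea as the set version, with a single integer standing in for the set.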

(PS: dict is a built-in type, so don't use it as a variable name :p. And string is a module from the standard library, so a parameter named string shadows it.)

answered Oct 22 '22 03:10

TerryA
