I'm trying to learn Python. Here is the relevant part of the exercise:
For each word, check to see if the word is already in a list. If the word is not in the list, add it to the list.
Here is what I've got.
    fhand = open('romeo.txt')
    output = []

    for line in fhand:
        words = line.split()
        for word in words:
            if word is not output:
                output.append(word)

    print sorted(output)

Here is what I get.

    ['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'and', 'and',
     'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'is', 'is',
     'kill', 'light', 'moon', 'pale', 'sick', 'soft', 'sun', 'sun',
     'the', 'the', 'the', 'through', 'what', 'window', 'with', 'yonder']

Note duplication (and, is, sun, etc).
How do I get only unique values?
Method 2: Using a set. Python's built-in set() stores each value only once, even if it is inserted more than once, so it makes checking for unique values easy. Build a set from the list with list_set = set(list1), then convert the set back to a list to print it.
NumPy's unique() function finds the unique elements of an array and returns them sorted.
With a set. A set contains only unique values.
To eliminate duplicates from a list, you can maintain an auxiliary list and check each item against it.
    myList = ['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'and', 'and',
              'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'is', 'is', 'kill', 'light',
              'moon', 'pale', 'sick', 'soft', 'sun', 'sun', 'the', 'the', 'the',
              'through', 'what', 'window', 'with', 'yonder']

    auxiliaryList = []
    for word in myList:
        if word not in auxiliaryList:
            auxiliaryList.append(word)

output:
    ['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'breaks', 'east',
     'envious', 'fair', 'grief', 'is', 'kill', 'light', 'moon', 'pale', 'sick',
     'soft', 'sun', 'the', 'through', 'what', 'window', 'with', 'yonder']

This is very simple to comprehend, and the code is self-explanatory. However, that simplicity comes at the expense of efficiency: each membership test is a linear scan over a growing list, so what could be a linear algorithm degrades to quadratic.
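To make the quadratic cost concrete, here is a rough sketch that spells out the scan hidden inside `word not in auxiliaryList` and counts the comparisons it performs. The function name `dedupe_counting` and the counter are illustrative additions, not part of the answer above, and the code assumes Python 3.

```python
def dedupe_counting(words):
    """Order-preserving dedup with a list, counting equality comparisons."""
    unique = []
    comparisons = 0
    for word in words:
        found = False
        # This inner loop is what `word not in unique` does implicitly.
        for seen in unique:
            comparisons += 1
            if seen == word:
                found = True
                break
        if not found:
            unique.append(word)
    return unique, comparisons


# Worst case: every word is distinct, so each scan runs to the end of the list.
for n in (100, 200, 400):
    _, count = dedupe_counting([str(i) for i in range(n)])
    print(n, count)  # count equals n * (n - 1) // 2
```

Doubling the number of distinct words roughly quadruples the comparison count, which is the quadratic growth described above.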
If the order is not important, you could use set() instead.
A set object is an unordered collection of distinct hashable objects.
Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.
Since the average case for membership checking in a hash-table is O(1), using a set is more efficient.
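As a small illustration of the hashability requirement (a sketch assuming Python 3; the variable names are made up for the example): strings can be set members, while a list cannot, because lists are mutable and define no hash.

```python
seen = {"sun", "moon"}      # strings are hashable, so they can be set members
print("sun" in seen)        # membership is resolved via the hash, not a scan

try:
    seen.add(["the", "east"])   # a list is mutable and therefore unhashable
    list_rejected = False
except TypeError:
    list_rejected = True

seen.add(("the", "east"))   # a tuple of hashable items is itself hashable
print(list_rejected, len(seen))
```

For the word list in this question every element is a string, so the restriction never gets in the way.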
    auxiliaryList = list(set(myList))

output:

    ['and', 'envious', 'already', 'fair', 'is', 'through', 'pale', 'yonder',
     'what', 'sun', 'Who', 'But', 'moon', 'window', 'sick', 'east', 'breaks',
     'grief', 'with', 'light', 'It', 'Arise', 'kill', 'the', 'soft', 'Juliet']
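Since the question prints its result with sorted(), the arbitrary order of a set does not matter there. A minimal sketch, assuming Python 3 and a shortened, made-up sample list in place of the full myList:

```python
sample = ['the', 'sun', 'and', 'the', 'moon', 'and', 'Juliet']

# set() drops the duplicates; sorted() returns a new list in a fixed order.
unique_sorted = sorted(set(sample))
print(unique_sorted)  # ['Juliet', 'and', 'moon', 'sun', 'the']
```

Capitalized words sort before lowercase ones under the default string comparison, which matches the ordering in the question's output.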
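Instead of the is not operator, you should use the not in operator to check whether the item is in the list. is not compares object identity, and a word is never the same object as the output list, so the original test is always true and every word gets appended. The check should be:

    if word not in output:

BTW, using a set is a lot more efficient (see Time complexity):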
    with open('romeo.txt') as fhand:
        output = set()
        for line in fhand:
            words = line.split()
            output.update(words)

UPDATE: The set does not preserve the original order. To preserve the order, use the set as an auxiliary data structure:
    output = []
    seen = set()
    with open('romeo.txt') as fhand:
        for line in fhand:
            words = line.split()
            for word in words:
                if word not in seen:  # faster than `word not in output`
                    seen.add(word)
                    output.append(word)
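To make the difference between `is not` and `not in` from this answer concrete, here is a tiny sketch (Python 3 assumed; the two-word list is an invented stand-in for the words read from romeo.txt):

```python
output = ['sun', 'moon']
word = 'sun'

# `is not` compares identity: a str object is never the list object itself,
# so this is True for every word, including ones already stored.
identity_check = word is not output

# `not in` compares the word with each element of the list by equality.
membership_check = word not in output

print(identity_check)    # True  -> the original code would append a duplicate
print(membership_check)  # False -> 'sun' is already present, so it is skipped
```

That is why the list in the question ends up with every word from the file, repeats included.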