Add only unique values to a list in python

Tags:

python

list

I'm trying to learn python. Here is the relevant part of the exercise:

For each word, check to see if the word is already in a list. If the word is not in the list, add it to the list.

Here is what I've got.

fhand = open('romeo.txt')
output = []

for line in fhand:
    words = line.split()
    for word in words:
        if word is not output:
            output.append(word)

print sorted(output)

Here is what I get.

['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'and', 'and',  'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'is', 'is',  'kill', 'light', 'moon', 'pale', 'sick', 'soft', 'sun', 'sun',  'the', 'the', 'the', 'through', 'what', 'window', 'with', 'yonder']

Note duplication (and, is, sun, etc).

How do I get only unique values?

asked Feb 19 '17 23:02

2 Answers

To eliminate duplicates from a list, you can maintain an auxiliary list and check each item against it.

myList = ['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'and', 'and',
          'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'is', 'is', 'kill', 'light',
          'moon', 'pale', 'sick', 'soft', 'sun', 'sun', 'the', 'the', 'the',
          'through', 'what', 'window', 'with', 'yonder']

auxiliaryList = []
for word in myList:
    if word not in auxiliaryList:
        auxiliaryList.append(word)

output:

['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'breaks', 'east',    'envious', 'fair', 'grief', 'is', 'kill', 'light', 'moon', 'pale', 'sick',   'soft', 'sun', 'the', 'through', 'what', 'window', 'with', 'yonder']

This is very simple to comprehend, and the code is self-explanatory. However, the simplicity comes at the expense of efficiency: each membership check is a linear scan over a growing list, so what looks like a linear algorithm degrades to quadratic time.
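To make the quadratic cost concrete, here is a minimal sketch that writes out the scan hidden inside word not in auxiliaryList and counts the comparisons. The function name dedupe_with_list and the steps counter are made up for illustration; they are not part of the code above.

```python
def dedupe_with_list(words):
    """Return (unique_words, steps); steps counts element comparisons."""
    unique = []
    steps = 0
    for word in words:
        # Written-out equivalent of `word not in unique`: a linear scan.
        for seen in unique:
            steps += 1
            if seen == word:
                break
        else:
            # No match was found, so this word is new.
            unique.append(word)
    return unique, steps


# Hypothetical worst case: every word is distinct.
_, steps_100 = dedupe_with_list([str(i) for i in range(100)])
_, steps_200 = dedupe_with_list([str(i) for i in range(200)])
print(steps_100, steps_200)  # 4950 19900
```

Doubling the number of distinct words roughly quadruples the comparisons, which is the quadratic behaviour described above.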

If the order is not important, you could use set().

A set object is an unordered collection of distinct hashable objects.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.
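As a small illustration of what hashable means in practice (the sample values are made up): strings and tuples can go into a set, while a list cannot.

```python
# Strings and tuples are hashable, so they can be set members.
pairs = {("sun", 1), ("sun", 1), ("moon", 2)}
print(len(pairs))  # 2, the duplicate tuple is stored only once

# Lists are mutable and unhashable, so putting one in a set raises TypeError.
try:
    bad = {["sun", 1]}
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

For the word list in this question that is not a concern, since every item is a string.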

Since the average case for membership checking in a hash-table is O(1), using a set is more efficient.

auxiliaryList = list(set(myList))

output:

['and', 'envious', 'already', 'fair', 'is', 'through', 'pale', 'yonder',   'what', 'sun', 'Who', 'But', 'moon', 'window', 'sick', 'east', 'breaks',   'grief', 'with', 'light', 'It', 'Arise', 'kill', 'the', 'soft', 'Juliet']
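The order of a set is arbitrary, so the list above may come out differently on another run. Since the question sorts the result anyway, one option is to sort the set directly. A short sketch with a made-up sample list:

```python
# Hypothetical sample; any list of strings works the same way.
words = ['sun', 'Arise', 'and', 'sun', 'and', 'the']

# set() drops the duplicates, sorted() returns a new ordered list.
unique_sorted = sorted(set(words))
print(unique_sorted)  # ['Arise', 'and', 'sun', 'the']
```

Uppercase letters sort before lowercase ones by default, which matches the ordering shown in the question.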

answered Sep 27 '22 19:09

Tony Tannous

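Instead of the is not operator, you should use the not in operator to check whether the item is already in the list. The is not operator compares object identity, and a word is never the same object as the output list, so the original test is always true and every word gets appended: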

if word not in output:
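A quick way to see the difference, using a made-up one-item list:

```python
output = ["sun"]
word = "sun"

# Identity test: is `word` a different object from the list itself?
# Always True here, because a str is never the list object.
identity_check = word is not output

# Membership test: is `word` missing from the list's items?
# False here, because "sun" is already in the list.
membership_check = word not in output

print(identity_check, membership_check)  # True False
```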

BTW, using a set is a lot more efficient (see Time complexity):

with open('romeo.txt') as fhand:
    output = set()
    for line in fhand:
        words = line.split()
        output.update(words)

UPDATE: A set does not preserve the original order. To preserve the order, use the set as an auxiliary data structure:

output = []
seen = set()
with open('romeo.txt') as fhand:
    for line in fhand:
        words = line.split()
        for word in words:
            if word not in seen:  # faster than `word not in output`
                seen.add(word)
                output.append(word)
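On Python 3.7 and later, where dicts keep insertion order, dict.fromkeys is another possible way to get the same order-preserving result. This is an alternative sketch, not part of the answer above, and the sample list is made up:

```python
# Hypothetical sample; in the question these words would come from romeo.txt.
words = ["the", "sun", "and", "the", "moon", "and", "sun"]

# Dict keys are unique and keep the order of first insertion (Python 3.7+).
output = list(dict.fromkeys(words))
print(output)  # ['the', 'sun', 'and', 'moon']
```

Like the seen set, this relies on hashing, so each lookup is O(1) on average.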

answered Sep 27 '22 18:09

falsetru

Tim Elhajj
