
Eliminating duplicated elements in a list

Tags: python, list

I was trying chapter 10.15 in the book Think Python and wrote the following code:

def turn_str_to_list(string):
    res = []
    for letter in string:
        res.append(letter)
    return res

def sort_and_unique (t):
    t.sort()
    for i in range (0, len(t)-2, 1):
        for j in range (i+1, len(t)-1, 1):
            if t[i]==t[j]:
                del t[j]
    return t

line=raw_input('>>>')
t=turn_str_to_list(line)
print t
print sort_and_unique(t)

I used a nested 'for' loop to eliminate duplicated elements in a sorted list. However, when I ran it, I kept getting wrong output. If I input 'committee', the output is ['c', 'e', 'i', 'm', 'o', 't', 't'], which is wrong because it still contains a double 't'. I tried different inputs: sometimes the program misses duplicated letters in the middle of the list, and it always misses the ones at the end. What am I missing? Thanks, guys.

homerMeng asked May 08 '26 17:05


1 Answer

Your program isn't removing all the duplicate letters because calling del t[j] inside the nested for-loops causes it to skip letters.

I added some prints to help illustrate this:

def sort_and_unique (t):
    t.sort()
    for i in range (0, len(t)-2, 1):
        print "i: %d" % i
        print t
        for j in range (i+1, len(t)-1, 1):
            print "\t%d %s len(t):%d" % (j, t[j], len(t))
            if t[i]==t[j]:
                print "\tdeleting %c" % t[j]
                del t[j]
    return t

Output:

>>>committee
['c', 'o', 'm', 'm', 'i', 't', 't', 'e', 'e']
i: 0
['c', 'e', 'e', 'i', 'm', 'm', 'o', 't', 't']
        1 e len(t):9
        2 e len(t):9
        3 i len(t):9
        4 m len(t):9
        5 m len(t):9
        6 o len(t):9
        7 t len(t):9
i: 1
['c', 'e', 'e', 'i', 'm', 'm', 'o', 't', 't']
        2 e len(t):9
        deleting e
        3 m len(t):8
        4 m len(t):8
        5 o len(t):8
        6 t len(t):8
        7 t len(t):8
i: 2
['c', 'e', 'i', 'm', 'm', 'o', 't', 't']
        3 m len(t):8
        4 m len(t):8
        5 o len(t):8
        6 t len(t):8
i: 3
['c', 'e', 'i', 'm', 'm', 'o', 't', 't']
        4 m len(t):8
        deleting m
        5 t len(t):7
        6 t len(t):7
i: 4
['c', 'e', 'i', 'm', 'o', 't', 't']
        5 t len(t):7
i: 5
['c', 'e', 'i', 'm', 'o', 't', 't']
i: 6
['c', 'e', 'i', 'm', 'o', 't', 't']
['c', 'e', 'i', 'm', 'o', 't', 't']

Whenever del t[j] is called, the list becomes one element shorter, but the inner loop's j keeps advancing through a range that was computed before the deletion.

For example:

i=1, j=2, t = ['c', 'e', 'e', 'i', 'm', 'm', 'o', 't', 't']

It sees that t[1] == t[2] (both 'e') so it removes t[2].

Now t = ['c', 'e', 'i', 'm', 'm', 'o', 't', 't']

However, the code continues with i=1, j=3, which compares 'e' to 'm' and skips over 'i'.
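To see the skip in isolation, here is a small sketch of just the i=1 pass from the trace above. The `visited` list is a hypothetical helper added for illustration; it records every letter the inner loop actually looks at. The code avoids print statements so it runs unchanged on Python 2 or 3.

```python
# Sketch of the i=1 pass only; `visited` is an illustrative helper,
# not part of the original program.
t = ['c', 'e', 'e', 'i', 'm', 'm', 'o', 't', 't']
i = 1
visited = []

# range() is evaluated once, here, while len(t) is still 9,
# so j will run 2..7 no matter what gets deleted below.
for j in range(i + 1, len(t) - 1):
    visited.append(t[j])
    if t[i] == t[j]:
        del t[j]  # every element after index j shifts left by one

# visited is now ['e', 'm', 'm', 'o', 't', 't']: 'i' was never examined
```

After the deletion at j=2, 'i' slides into index 2, but the loop has already moved on to j=3, so 'i' is never compared against anything during this pass.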

Lastly, it does not catch the last two 't's because by the time i=5, len(t) is 7, so the inner for-loop iterates over range(6, 6, 1), which is empty, and its body is never executed.
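One possible fix, as a minimal sketch rather than the only way to do it: since the list is sorted, duplicates are always adjacent, so a single while loop that re-reads len(t) on every pass and only advances when nothing was deleted should be enough. This keeps the original function name and in-place behaviour; the while-loop approach itself is my substitution for the nested for-loops.

```python
def sort_and_unique(t):
    """Sort t in place, then delete adjacent duplicates."""
    t.sort()
    i = 0
    # The condition is re-checked each pass, so it always sees the current length.
    while i < len(t) - 1:
        if t[i] == t[i + 1]:
            # Stay at i: the next element has just shifted into position i + 1.
            del t[i + 1]
        else:
            i += 1
    return t
```

With this version, sort_and_unique(list('committee')) gives ['c', 'e', 'i', 'm', 'o', 't']. Note that list(line) does the same job as turn_str_to_list, and if you don't need to modify the list in place, sorted(set(line)) produces the same result in one step.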

nitekrawler answered May 10 '26 06:05


