
Finding duplicates in list and operating only on one of them

I have expanded an earlier question and am posting it as a new one.

I have a list:

li = [2, 3, 1, 4, 2, 2, 2, 3, 1, 3, 2] 

Then I find the value that occurs most often and store it in the variable i2:

f = {}
for item in li:
    f[item] = f.get(item, 0) + 1   

most = max(f.values())  # highest count, computed once
for i in f:
    if f[i] == most:
        i2 = i

Later, every value that repeats is increased by 10, except for the most common value. This is the code I use:

for i in range(len(li)):
    for x in range(i + 1, len(li)):
        if li[i] == li[x] and li[i] != i2:
            li[x] = li[x] + 10

After this operation I get:

li = [2, 3, 1, 4, 2, 2, 2, 13, 11, 23, 2]

As you can see, the most common value is 2, so it remains unchanged. The value 3, for example, occurs three times, and its three occurrences become 3, 3 + 10 and 3 + 20. The same applies to the other values (except 2). However, if the most common value occurs several times in a row, I would like to increase each subsequent element of that run by a further 10, and get:

li = [2, 3, 1, 4, 2, 12, 22, 13, 11, 23, 2] 

How can I do this? I could run a second loop over the already modified list with the condition li[i] == li[i+1], but can it be done in the current loop?

Tomasz Przemski asked Dec 15 '17 09:12

4 Answers

First, you should use a collections.Counter to count the elements in the list and its most_common method to find the most common element.

import collections

li = [2, 3, 1, 4, 2, 2, 2, 3, 1, 3, 2]
l2 = collections.Counter(li).most_common(1)[0][0]

Then, you can use a second Counter for the running counts and reset a count to 0 whenever the current element starts a new run of the most common element (that is, it equals the most common element and differs from the previous element). Use that counter to add a multiple of 10 to the number, and increment it afterwards.

running = collections.Counter()
last = None
for i, e in enumerate(li):
    if e == l2 and e != last:
        running[e] = 0
    li[i] = e + 10 * running[e]
    running[e] += 1
    last = e

Afterwards, li is [2, 3, 1, 4, 2, 12, 22, 13, 11, 23, 2]
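
As a minimal sketch, the same running-counter idea can be wrapped in a function that leaves the input list untouched. The name spread_duplicates is hypothetical and not part of the original answer; the empty-list guard is an added assumption.

```python
from collections import Counter


def spread_duplicates(values):
    """Return a copy of values with repeats shifted by multiples of 10.

    Hypothetical helper: the most common value is shifted only inside a
    consecutive run; every other value is shifted on each repeat.
    """
    if not values:
        return []
    most_common = Counter(values).most_common(1)[0][0]
    running = Counter()
    result = []
    last = None
    for e in values:
        # A new run of the most common value starts counting from zero again.
        if e == most_common and e != last:
            running[e] = 0
        result.append(e + 10 * running[e])
        running[e] += 1
        last = e
    return result


li = [2, 3, 1, 4, 2, 2, 2, 3, 1, 3, 2]
print(spread_duplicates(li))  # [2, 3, 1, 4, 2, 12, 22, 13, 11, 23, 2]
```

Returning a new list rather than assigning to li[i] keeps the original values available for comparison.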

tobias_k answered Nov 12 '22 04:11


Here is a solution using a single nested loop:

import numpy as np
from collections import Counter

li = [2, 3, 1, 4, 2, 2, 2, 3, 1, 3, 2]
#Get most common
i2=Counter(li).most_common()[0][0]

for val in set(li): #loop over all unique values in li
    inds=np.where([i==val for i in li])[0] #get array of indices where li==val
    #special case for i2:
    if val==i2:
        c=1
        for ind in range(1,len(inds)):
            if inds[ind]==inds[ind-1]+1:
                li[inds[ind]]=li[inds[ind]]+10*c
                c+=1
            else:
                c=1
    #not i2:
    else:
        c=1
        for ind in range(1,len(inds)):
            li[inds[ind]]=li[inds[ind]]+10*c
            c+=1

And it returns:

print(li)
[2, 3, 1, 4, 2, 12, 22, 13, 11, 23, 2]

Walkthrough step by step:

Counter is a much quicker way to get i2. most_common() returns (value, count) pairs sorted by count, so we take element zero of the first pair, which is the value (not the count) of the most common element in the list.
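
To illustrate what most_common() returns for the list from the question, here is a small sketch (the variable name pairs is only for illustration):

```python
from collections import Counter

li = [2, 3, 1, 4, 2, 2, 2, 3, 1, 3, 2]
pairs = Counter(li).most_common()  # (value, count) pairs, highest count first
print(pairs)      # [(2, 5), (3, 3), (1, 2), (4, 1)]
i2 = pairs[0][0]  # value part of the first pair
print(i2)         # 2
```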

The outer loop then iterates over all unique values in the list, first getting the indices at which li equals that value.

Then, if val==i2, it initializes the multiplier c to 1 and the inner loop checks for consecutive indices (NB: this loop starts at 1, so the first occurrence of any val is never touched). If two indices are consecutive, it increases the value in li and then the multiplier; if they are not, it resets the multiplier to 1.

For all other values, it simply loops over the indices (again starting from the second), increasing both the value and the multiplier.
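
The index lookup and the consecutive-index check from the walkthrough can also be sketched without NumPy. This is only an illustration that uses a plain list comprehension in place of np.where; the names inds and consecutive follow the answer loosely and are otherwise assumptions.

```python
li = [2, 3, 1, 4, 2, 2, 2, 3, 1, 3, 2]
val = 2

# Positions in li that hold val
inds = [i for i, x in enumerate(li) if x == val]
print(inds)         # [0, 4, 5, 6, 10]

# True where an index directly follows the previous one
consecutive = [b == a + 1 for a, b in zip(inds, inds[1:])]
print(consecutive)  # [False, True, True, False]
```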

sandro scodelller answered Nov 12 '22 04:11


I hope I got your question right. Here it is:

from collections import Counter


def fix_pivot(my_list, max_el):
    new_list = []
    inc = 0
    for item in my_list:
        if item == max_el:
            new_list.append(item + inc)
            inc += 10
        else:
            new_list.append(item)
            inc = 0
    return new_list

li = [2, 3, 1, 4, 2, 2, 3, 1, 3, 2]

counted_li = Counter(li)
pivot = counted_li.most_common(1)[0][0]

# operating on all elements except for the most frequent, see note 1
temp = {k:[k + 10*(v-i-1) for i in range(v)] for k, v in counted_li.items()}
new = [temp[k].pop() if k != pivot else k for k in li]

# operating on the most frequent element, see note 2
res = fix_pivot(new, pivot)
print(res)  # -> [2, 3, 1, 4, 2, 12, 13, 11, 23, 2] 

Notes:

  1. Based on the frequencies of the elements in the original list li, a dictionary (temp) is created that looks like this:

    {2: [32, 22, 12, 2], 3: [23, 13, 3], 1: [11, 1], 4: [4]}
    

    Combined with the [temp[k].pop() if k != pivot else k for k in li] list comprehension, it results in a very elegant (imo at least, love the popping action) way of meeting the first part of the requirements: incrementing all elements that are not the most frequent one.

  2. For the second, bizarre, requirement, the cleanest way to go about it is with a function (again imo). Every time the function meets the most frequent element, it adds the current increment and then raises it by 10 (0 -> 10 -> 20), and every time it finds a different element, it resets the increment to 0.
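
The reset-on-break behaviour described in note 2 can also be sketched with itertools.groupby, which splits a list into runs of equal consecutive elements. The name fix_pivot_grouped is hypothetical; it is meant as an alternative illustration, not as the answer's own method.

```python
from itertools import groupby


def fix_pivot_grouped(my_list, max_el):
    """Add 0, 10, 20, ... within each consecutive run of max_el."""
    new_list = []
    for value, run in groupby(my_list):
        run = list(run)
        if value == max_el:
            # n-th element of the run gets 10 * n added
            new_list.extend(value + 10 * n for n in range(len(run)))
        else:
            new_list.extend(run)
    return new_list


new = [2, 3, 1, 4, 2, 2, 13, 11, 23, 2]
print(fix_pivot_grouped(new, 2))  # [2, 3, 1, 4, 2, 12, 13, 11, 23, 2]
```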

Ma0 answered Nov 12 '22 05:11


Here's my answer, but I think tobias_k's answer is the most elegant yet.

from collections import Counter

li = [2, 3, 1, 4, 2, 2, 3, 1, 3, 2]
c = Counter(li)
mc = max(c, key=c.get)

mapper = {k: 0 for k in li}
out = []
for i, v in enumerate(li):
    if v == mc:
        if i > 0 and li[i - 1] == mc:
            mapper[v] += 10
            out.append(v + mapper[v])
        else:
            mapper[v] = 0
            out.append(v)
    else:
        out.append(v + mapper[v])
        mapper[v] += 10

print(out)  # -> [2, 3, 1, 4, 2, 12, 13, 11, 23, 2]

Jurgy answered Nov 12 '22 05:11