Replace duplicate items from list while keeping the first occurrence

Tags: python, list, duplicates

I have a list lst = [1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,4,4,4,4,4]

I'm expecting the following output:

out = [1,"","",2,"","","",3,"","","","",4,"","","","","","","",""]

I want to keep the first occurrence of the item and replace all other occurrences of the same item with empty strings.

I tried the following approach.

def splrep(lst):
    from collections import Counter
    C = Counter(lst)
    flst = [ [k,]*v for k,v in C.items()]
    nl = []
    for i in flst:
        nl1 = []
        for j,k in enumerate(i):
            nl1.append(j)
        nl.append(nl1)

    ng = list(zip(flst, nl))
    for i,j in ng:
        j.pop(0)
    for i,j in ng:
        for k in j:
            i[k] = ''
    final = [i for [i,j] in ng]
    fin = [i for j in final for i in j]
    return fin

But I'm looking for some simpler or better approaches.

asked Jan 04 '19 10:01

sharathchandramandadi

1 Answer

Use itertools.groupby, which is well suited to grouping runs of consecutive duplicate values.

from itertools import groupby
[v for k, g in groupby(lst) for v in [k] + [""] * (len(list(g))-1)]
# [1, '', '', 2, '', '', '', 3, '', '', '', '', 4, '', '', '', '', '', '', '', '']
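
To see why this works, here is a small sketch of what groupby produces for the list in the question. The variable name `runs` is only for illustration and is not part of the original answer.

```python
from itertools import groupby

lst = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4]

# groupby yields one (key, group-iterator) pair per run of equal adjacent values.
# Here each group is consumed to count how long the run is.
runs = [(key, len(list(group))) for key, group in groupby(lst)]
print(runs)  # [(1, 3), (2, 4), (3, 5), (4, 9)]
```

Each run of length n then becomes the key followed by n - 1 empty strings, which is what the one-liner above builds.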

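If equal values are not adjacent in your list, you can sort it first, since groupby only groups consecutive items. Keep in mind that sorting changes the order of the elements.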
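
If the duplicates are not adjacent and the original order has to be preserved, one possible alternative to sorting is to track the values already seen in a set. This is a minimal sketch, not part of the original answer; the function name `keep_first_occurrence` and the `filler` parameter are made up for illustration, and it assumes the list items are hashable.

```python
def keep_first_occurrence(values, filler=""):
    """Return a new list where every repeat of a value is replaced by `filler`."""
    seen = set()  # values that have already appeared once
    result = []
    for value in values:
        if value in seen:
            result.append(filler)  # repeat: blank it out
        else:
            seen.add(value)
            result.append(value)  # first occurrence: keep it
    return result


print(keep_first_occurrence([1, 2, 1, 3, 2, 1]))  # [1, 2, '', 3, '', '']
```

For the list in the question, where equal values are already adjacent, this should give the same result as the groupby version.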

answered Oct 10 '22 21:10

cs95
