I have a text file with some 1,200 rows. Some of them are duplicates.
How can I find the duplicate lines in the file (ignoring case) and print each duplicate's text on the screen, so I can go and locate it? I don't want to delete anything, just find out which lines are duplicated.
This is pretty easy with a set:
with open('file') as f:
    seen = set()
    for line in f:
        line_lower = line.lower()
        if line_lower in seen:
            print(line)
        else:
            seen.add(line_lower)
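Not part of the original answer, but since you want to go and find the duplicates afterwards, it may help to print the line numbers too. A minimal sketch of the same set idea using enumerate; the file name 'file' and the first_seen dict are placeholders:
with open('file') as f:
    first_seen = {}  # lowercased text -> line number where it first appeared
    for num, line in enumerate(f, 1):
        key = line.strip().lower()
        if not key:
            continue  # ignore blank lines
        if key in first_seen:
            print(f"line {num} duplicates line {first_seen[key]}: {line.strip()}")
        else:
            first_seen[key] = num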
Since there are only 1,200 lines, you can also use collections.Counter():
>>> from collections import Counter
>>> with open('data1.txt') as f:
...     c = Counter(line.strip().lower() for line in f if line.strip())  # case-insensitive counts
...     for line in c:
...         if c[line] > 1:
...             print(line)
...
If data1.txt is something like this:
ABC
abc
aBc
CAB
caB
bca
BcA
acb
output is:
abc
cab
bca
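Not in the original answer, but if you also want to see how many times each duplicate occurs, the same Counter exposes most_common(); continuing the session above, with the sample data it should print something like:
>>> for line, count in c.most_common():
...     if count > 1:
...         print(line, count)
...
abc 3
cab 2
bca 2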