Remove Duplicates from Text File

Question

I want to remove duplicate words from a text file.

I have some text files whose contents look like the following:

None_None

ConfigHandler_56663624
ConfigHandler_56663624
ConfigHandler_56663624
ConfigHandler_56663624

None_None

ColumnConverter_56963312
ColumnConverter_56963312

PredicatesFactory_56963424
PredicatesFactory_56963424

PredicateConverter_56963648
PredicateConverter_56963648

ConfigHandler_80134888
ConfigHandler_80134888
ConfigHandler_80134888
ConfigHandler_80134888

The resulting output should be:

None_None

ConfigHandler_56663624

ColumnConverter_56963312

PredicatesFactory_56963424

PredicateConverter_56963648

ConfigHandler_80134888

So far I have only tried this command: en=set(open('file.txt') but it does not work.

Could anyone help me extract only the unique entries from the file?

Thank you.

StuGrey · Accepted Answer

Here is a simple solution using sets to remove the duplicates from the text file.

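# Read every line into memory; the with block closes the file afterwards.
with open('workfile.txt', 'r') as f:
    lines = f.readlines()

# A set drops duplicates but does not preserve the original line order.
lines_set = set(lines)

# Overwrite the same file with the de-duplicated lines.
with open('workfile.txt', 'w') as out:
    for line in lines_set:
        out.write(line)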

Jon Clements · Answer

Here's an option that preserves order (unlike a set), but otherwise has the same behaviour (note that the EOL character is deliberately stripped and blank lines are ignored)...

from collections import OrderedDict

with open('/home/jon/testdata.txt') as fin:
    lines = (line.rstrip() for line in fin)
    unique_lines = OrderedDict.fromkeys( (line for line in lines if line) )

print(list(unique_lines.keys()))
# ['None_None', 'ConfigHandler_56663624', 'ColumnConverter_56963312', 'PredicatesFactory_56963424', 'PredicateConverter_56963648', 'ConfigHandler_80134888']

Then you just need to write the above to your output file.
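For completeness, here is a minimal sketch of that last step. The function name `write_unique_lines`, its parameters, and the output filename in the usage note are hypothetical, not part of the original answer. Because the EOL character was stripped while reading, it has to be added back when writing.

```python
from collections import OrderedDict


def write_unique_lines(src_path, dst_path):
    """Copy src_path to dst_path, keeping the first occurrence of each non-blank line.

    Hypothetical helper for illustration; returns the unique lines as a list.
    """
    with open(src_path) as fin:
        # Strip trailing whitespace (including the newline) and skip blank lines.
        stripped = (line.rstrip() for line in fin)
        # OrderedDict keys are unique and keep first-seen order.
        unique_lines = OrderedDict.fromkeys(line for line in stripped if line)

    with open(dst_path, 'w') as fout:
        for line in unique_lines:
            # Re-add the newline that rstrip() removed.
            fout.write(line + '\n')

    return list(unique_lines)
```

You could call it as `write_unique_lines('/home/jon/testdata.txt', 'unique.txt')`. If you want a blank line between entries, as in the expected output in the question, write `line + '\n\n'` instead.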

Tags: python, string, duplicates

Kaushik
