
How to count how many lines in a file are the same?

I have a text document in the format of:

-1+1
-1-1
+1+1
-1-1
+1-1
...

I want a program that counts how many lines are either -1+1 or +1-1. It only needs to return the total number of such lines.

I have written the code:

f1 = open("results.txt", "r")
fileOne = f1.readlines()
f1.close()

x = 0
for i in fileOne:
    if i == '-1+1':
        x += 1
    elif i == '+1-1':
        x += 1
    else:
        continue

print x

But for some reason, it always returns 0 and I have no idea why.

mrpopo asked Jan 10 '13 14:01


2 Answers

Use collections.Counter instead:

import collections

with open('results.txt') as infile:
    counts = collections.Counter(l.strip() for l in infile)
for line, count in counts.most_common():
    print line, count

Most importantly, strip whitespace from each line before counting. The trailing newline is the main culprit, but stray spaces or tabs can interfere too.
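
To get the single total the question asks for, the two patterns can be summed from the counter. This is only a minimal sketch, written in Python 3 syntax (the snippets above are Python 2); the helper name count_matching and its targets parameter are illustrative assumptions, not part of the original answer.

```python
import collections


def count_matching(lines, targets=("-1+1", "+1-1")):
    """Return how many lines equal one of `targets` after stripping whitespace."""
    counts = collections.Counter(line.strip() for line in lines)
    # A Counter returns 0 for missing keys, so absent patterns are safe to sum.
    return sum(counts[t] for t in targets)


# Hypothetical sample data mirroring the format shown in the question.
sample = ["-1+1\n", "-1-1\n", "+1+1\n", "-1-1\n", "+1-1\n"]
print(count_matching(sample))
```

With a real file, the open file object can be passed directly in place of the sample list, since iterating over it yields one line at a time.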

Martijn Pieters answered Oct 02 '22 19:10


.readlines() leaves the trailing \n on each line, which is why the comparisons never match.
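
The effect is easy to see by inspecting a line with repr() and comparing the count with and without the newline removed. The sketch below is in Python 3 syntax and uses io.StringIO as a stand-in for results.txt, which is an assumption made so the example runs on its own; the variable names are illustrative.

```python
import io

# Stand-in for open("results.txt", "r"); assumed contents match the question.
f1 = io.StringIO("-1+1\n-1-1\n+1+1\n-1-1\n+1-1\n")
lines = f1.readlines()
f1.close()

wanted = ("-1+1", "+1-1")

first_raw = lines[0]  # still ends with the newline character
# Comparing the raw lines never matches, because each one ends in "\n".
matches_raw = sum(1 for i in lines if i in wanted)
# Removing the trailing newline first makes the comparison behave as intended.
matches_stripped = sum(1 for i in lines if i.rstrip("\n") in wanted)

print(repr(first_raw))
print(matches_raw, matches_stripped)
```

In the question's loop, the equivalent change would be comparing i.rstrip("\n") (or i.strip()) instead of i.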

eumiro answered Oct 02 '22 18:10