I have the following data:
1 3 4 2 6 7 8 8 93 23 45 2 0 0 0 1
0 3 4 2 6 7 8 8 90 23 45 2 0 0 0 1
0 3 4 2 6 7 8 6 93 23 45 2 0 0 0 1
-1 3 4 2 6 7 8 8 21 23 45 2 0 0 0 1
-1 3 4 2 6 7 8 8 0 23 45 2 0 0 0 1
The data above is in a file. I want to count the number of 1s, 0s, and -1s, but only in the first column. I am reading the file from standard input, and the only way I could think of is this:
cnt = 0
cnt1 = 0
cnt2 = 0
for line in sys.stdin:
    (t1, <having 15 different variables as that many columns are in files>) = re.split("\s+", line.strip())
    if re.match("+1", t1):
        cnt = cnt + 1
    if re.match("-1", t1):
        cnt1 = cnt1 + 1
    if re.match("0", t1):
        cnt2 = cnt2 + 1
How can I improve this, especially the 15 different variables? That line is the only place where those variables are used.
Use collections.Counter:
from collections import Counter
with open('abc.txt') as f:
    c = Counter(int(line.split(None, 1)[0]) for line in f)
print(c)
Output:
Counter({0: 2, -1: 2, 1: 1})
Here str.split(None, 1) splits the line on whitespace just once, so only the first column is separated from the rest:
>>> s = "1 3 4 2 6 7 8 8 93 23 45 2 0 0 0 1"
>>> s.split(None, 1)
['1', '3 4 2 6 7 8 8 93 23 45 2 0 0 0 1']
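Since the question reads from standard input rather than a named file, the same idea can be sketched as a small function that accepts any iterable of lines. This is only a minimal sketch: the name count_first_column and the sample list are hypothetical, and it assumes the first column always holds an integer and that blank lines should be skipped.

```python
from collections import Counter


def count_first_column(lines):
    """Count the integer values found in the first whitespace-separated column.

    `lines` can be any iterable of strings, e.g. sys.stdin or an open file.
    (Hypothetical helper name, used here for illustration only.)
    """
    counts = Counter()
    for line in lines:
        fields = line.split(None, 1)  # split once; the rest of the line is ignored
        if not fields:  # assumption: blank lines are skipped
            continue
        counts[int(fields[0])] += 1  # int() also accepts a signed form such as "+1"
    return counts


# Sample rows copied from the question; the first column is 1, 0, 0, -1, -1.
sample = [
    "1 3 4 2 6 7 8 8 93 23 45 2 0 0 0 1",
    "0 3 4 2 6 7 8 8 90 23 45 2 0 0 0 1",
    "0 3 4 2 6 7 8 6 93 23 45 2 0 0 0 1",
    "-1 3 4 2 6 7 8 8 21 23 45 2 0 0 0 1",
    "-1 3 4 2 6 7 8 8 0 23 45 2 0 0 0 1",
]
counts = count_first_column(sample)
print(counts[1], counts[0], counts[-1])  # prints: 1 2 2
```

To read from standard input instead, import sys and pass sys.stdin in place of sample. Looking up a value that never occurred in a Counter returns 0 rather than raising KeyError, so the three separate counter variables are not needed.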
NumPy makes it even easier:
>>> import numpy as np
>>> from collections import Counter
>>> Counter(np.loadtxt('abc.txt', usecols=(0,), dtype=int).tolist())
Counter({0: 2, -1: 2, 1: 1})