I have a list of floats with some hidden "level" information encoded in the scale of each float, and I can split the floats into "levels" like this:
import math
import numpy as np
all_scores = [1.0369411057174144e+22, 2.7997409854370188e+23, 1.296176382146768e+23,
6.7401171871631936e+22, 6.7401171871631936e+22, 2.022035156148958e+24, 8.65845823274041e+23,
1.6435516525621017e+24, 2.307193960221247e+24, 1.285806971089594e+24, 9603539.08653573,
17489013.841076534, 11806185.6660164, 16057293.564414097, 8546268.728385007, 53788629.47091801,
31828243.07349571, 51740168.15200098, 53788629.47091801, 22334836.315934014,
4354.0, 7474.0, 4354.0, 4030.0, 6859.0, 8635.0, 7474.0, 8635.0, 9623.0, 8479.0]
easy, med, hard = [], [], []
for i in all_scores:
    if i > math.exp(50):
        easy.append(i)
    elif i > math.exp(10):
        med.append(i)
    else:
        hard.append(i)
print([easy, med, hard])
[out]:
[[1.0369411057174144e+22, 2.7997409854370188e+23, 1.296176382146768e+23, 6.7401171871631936e+22, 6.7401171871631936e+22, 2.022035156148958e+24, 8.65845823274041e+23, 1.6435516525621017e+24, 2.307193960221247e+24, 1.285806971089594e+24], [9603539.08653573, 17489013.841076534, 11806185.6660164, 16057293.564414097, 8546268.728385007, 53788629.47091801, 31828243.07349571, 51740168.15200098, 53788629.47091801, 22334836.315934014], [4354.0, 7474.0, 4354.0, 4030.0, 6859.0, 8635.0, 7474.0, 8635.0, 9623.0, 8479.0]]
I also have another list that corresponds, element by element, to the all_scores list:
input_scores = [0.0, 2.7997409854370188e+23, 0.0, 6.7401171871631936e+22, 0.0, 0.0, 8.6584582327404103e+23, 0.0, 2.3071939602212471e+24, 0.0, 0.0, 17489013.841076534, 11806185.6660164, 0.0, 8546268.728385007, 0.0, 31828243.073495708, 51740168.152000979, 0.0, 22334836.315934014, 4354.0, 7474.0, 4354.0, 4030.0, 0.0, 8635.0, 0.0, 0.0, 0.0, 8479.0]
I need to check how many of the easy, med and hard scores match the corresponding entries in input_scores. I can get a boolean for each position of the flat all_scores list like this:
matches = [i == j for i, j in zip(input_scores, all_scores)]
print ([i == j for i, j in zip(input_scores, all_scores)])
[out]:
[False, True, False, True, False, False, True, False, True, False, False, True, True, False, True, False, True, True, False, True, True, True, True, True, False, True, False, False, False, True]
Is there a way to know how many easy/med/hard there are in the matches and the sum of the matches per level?
I have tried this and it works:
matches = [int(i == j) for i, j in zip(input_scores, all_scores)]
print(sum(matches[:len(easy)]), len(easy), sum(np.array(easy) * matches[:len(easy)]))
print(sum(matches[len(easy):len(easy)+len(med)]), len(med), sum(np.array(med) * matches[len(easy):len(easy)+len(med)]))
print(sum(matches[len(easy)+len(med):]), len(hard), sum(np.array(hard) * matches[len(easy)+len(med):]))
[out]:
4 10 3.52041505391e+24
6 10 143744715.777
6 10 37326.0
But there must be a less verbose way to achieve the same output.
Sounds to me like a job for... Counter!
If you haven't come across it yet, Counter is like dict, but in methods like .update() new values are added onto the existing values instead of replacing them. So:
from collections import Counter
counter = Counter({'a': 2})
counter.update({'a': 3})
counter['a']
> 5
So you get your result above with the following code:
import math
from collections import Counter
# one Counter each for: number of matches, group size, sum of matched scores
matches, counts, scores = [
    Counter({'easy': 0, 'med': 0, 'hard': 0}) for _ in range(3)
]
for score, inp in zip(all_scores, input_scores):
    category = (
        'easy' if score > math.exp(50) else
        'med' if score > math.exp(10) else
        'hard'
    )
    matches.update({category: score == inp})
    counts.update({category: 1})
    scores.update({category: score if score == inp else 0})
for cat in ('easy', 'med', 'hard'):
    print(matches[cat], counts[cat], scores[cat])
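The loop above passes `score == inp`, a bool, as the amount to add. That works because bool is a subclass of int in Python, so True adds 1 and False adds 0. A minimal sketch of just that detail, using made-up categories and flags rather than the scores from the question:

```python
from collections import Counter

# Hypothetical toy data: (category, did-it-match) pairs.
observations = [('easy', True), ('easy', False), ('hard', True), ('easy', True)]

tally = Counter({'easy': 0, 'hard': 0})
for category, matched in observations:
    # True is added as 1, False as 0
    tally.update({category: matched})

print(tally['easy'], tally['hard'])  # 2 1
```

Pre-seeding the Counter with zeros, as the answer does, keeps the stored values plain ints even for a category whose first update is a bool.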
Here is a numpy solution using digitize to create the categories and bincount to count and sum the matches. As a free bonus, these stats are also computed for the left-overs.
categories = 'hard', 'med', 'easy'
# get group membership by splitting at e^10 and e^50
# the 'right' keyword tells digitize to include right boundaries
cat_map = np.digitize(all_scores, np.exp((10, 50)), right=True)
# cat_map has a zero in all the 'hard' places of all_scores
# a one in the 'med' places and a two in the 'easy' places
# add a fourth group to mark all non-matches
# we have to force at least one np.array for element-by-element
# comparison to work
cat_map[np.asanyarray(all_scores) != input_scores] = 3
# count
numbers = np.bincount(cat_map)
# count again, this time using all_scores as weights
sums = np.bincount(cat_map, all_scores)
# print
for c, n, s in zip(categories + ('unmatched',), numbers, sums):
    print('{:12} {:2d} {:6.4g}'.format(c, n, s))
# output:
#
# hard          6 3.733e+04
# med           6 1.437e+08
# easy          4 3.52e+24
# unmatched    14 5.159e+24
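To see what the two calls do in isolation, here is a small sketch on made-up values (the array contents and the edges 10 and 100 are illustrative assumptions, not data from the question). digitize turns each value into a bin index, and bincount then either counts the indices or, given weights, sums the weights per index:

```python
import numpy as np

values = np.array([5.0, 50.0, 500.0, 7.0])  # hypothetical data
edges = np.array([10.0, 100.0])             # hypothetical bin edges

# right=True means a value equal to an edge falls into the lower bin,
# which mirrors the strict ">" comparisons in the question's loop
bins = np.digitize(values, edges, right=True)
print(bins.tolist())    # [0, 1, 2, 0]

# how many values landed in each bin
counts = np.bincount(bins)
print(counts.tolist())  # [2, 1, 1]

# sum of the values in each bin
sums = np.bincount(bins, weights=values)
print(sums.tolist())    # [12.0, 50.0, 500.0]
```

Overwriting some entries of the index array with an extra index, as the answer does with 3 for non-matches, simply adds one more slot to both results.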