How to calculate the percentage of each element in a list?

I have this list of 5 sequences of numbers:

['123', '134', '234', '214', '223'] 

and I want to obtain the percentage of each number 1, 2, 3, 4 at the ith position across the sequences. For example, the numbers at the 0th position of these 5 sequences are 1 1 2 2 2; I need to calculate the percentage of 1, 2, 3, 4 among them and return those percentages as the 0th element of a new list.

['123', '134', '234', '214', '223']

0th position: 1 1 2 2 2   the percentages of 1, 2, 3, 4 are, respectively: [0.4, 0.6, 0.0, 0.0]

1st position: 2 3 3 1 2   the percentages of 1, 2, 3, 4 are, respectively: [0.2, 0.4, 0.4, 0.0]

2nd position: 3 4 4 4 3   the percentages of 1, 2, 3, 4 are, respectively: [0.0, 0.0, 0.4, 0.6]

The desired result is to return:

[[0.4, 0.6, 0.0, 0.0], [0.2, 0.4, 0.4, 0.0], [0.0, 0.0, 0.4, 0.6]]

My attempt so far:

list(zip(*['123', '134', '234', '214', '223']))

Result:

 [('1', '1', '2', '2', '2'), ('2', '3', '3', '1', '2'), ('3', '4', '4', '4', '3')]

But I got stuck here: I don't know how to calculate the percentage of each of the numbers 1, 2, 3, 4 at every position to obtain the desired result. Any suggestions are appreciated!

Jassy.W asked Jan 03 '17 18:01


2 Answers

Starting from your approach, you could do the rest with a Counter:

from collections import Counter

for item in zip(*['123', '134', '234', '214', '223']):
    c = Counter(item)
    total = sum(c.values())
    percent = {key: value/total for key, value in c.items()}
    print(percent)

    # convert to list (digits 1 to 4, as in the question)
    percent_list = [percent.get(str(i), 0.0) for i in range(1, 5)]
    print(percent_list)

which prints (on Python 3.7+, where dicts keep insertion order)

{'1': 0.4, '2': 0.6}
[0.4, 0.6, 0.0, 0.0]
{'2': 0.4, '3': 0.4, '1': 0.2}
[0.2, 0.4, 0.4, 0.0]
{'3': 0.4, '4': 0.6}
[0.0, 0.0, 0.4, 0.6]
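
If you want the list of lists from the question rather than printed output, the same loop can be wrapped in a function. This is only a minimal sketch: the name `position_fractions` and the `symbols` parameter are made up for illustration, and it assumes all strings have the same length (zip stops at the shortest one).

```python
from collections import Counter


def position_fractions(sequences, symbols="1234"):
    """Return, for each position, the fraction of each symbol at that position.

    Hypothetical helper for illustration, not part of any library.
    """
    result = []
    for column in zip(*sequences):   # all characters found at one position
        counts = Counter(column)
        total = len(column)          # equals sum(counts.values())
        # A Counter returns 0 for symbols that never occur in this column.
        result.append([counts[s] / total for s in symbols])
    return result


print(position_fractions(['123', '134', '234', '214', '223']))
# [[0.4, 0.6, 0.0, 0.0], [0.2, 0.4, 0.4, 0.0], [0.0, 0.0, 0.4, 0.6]]
```

Indexing the Counter directly (instead of building a dict and calling `.get`) works because missing keys count as 0.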

hiro protagonist answered Oct 19 '22 08:10


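You could start by creating the zipped list as you did (here l is your input list):

from collections import Counter

l = ['123', '134', '234', '214', '223']
zipped = zip(*l)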

then map collections.Counter over it to get the counts of each item in every tuple produced by zip:

counts = map(Counter, zipped)

and then go through it, creating a list out of their counts divided by their sizes:

res = [[c[i]/sum(c.values()) for i in '1234'] for c in counts]
print(res) 
[[0.4, 0.6, 0.0, 0.0], [0.2, 0.4, 0.4, 0.0], [0.0, 0.0, 0.4, 0.6]]

If you are a one-liner kind of person, fold the first two steps into the comprehension to get this in one line:

res = [[c[i]/sum(c.values()) for i in '1234'] for c in map(Counter, zip(*l))]

Additionally, as noted in a comment, if you don't know the elements ahead of time, sorted(set(''.join(l))) could replace '1234'.
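
To illustrate that last point, here is a rough sketch that derives the symbols from the data instead of hard-coding '1234'. The function name `fractions_by_position` and the sample input are made up for illustration. It also computes the denominator once, which is safe because every tuple produced by zip(*l) holds exactly len(l) items.

```python
from collections import Counter


def fractions_by_position(l):
    """Hypothetical helper: returns the symbols found and the per-position fractions."""
    symbols = sorted(set(''.join(l)))   # every distinct character, in sorted order
    total = len(l)                      # each zipped column holds len(l) items
    table = [[c[s] / total for s in symbols] for c in map(Counter, zip(*l))]
    return symbols, table


symbols, table = fractions_by_position(['ab', 'ac', 'bc', 'aa'])
print(symbols)  # ['a', 'b', 'c']
print(table)    # [[0.75, 0.25, 0.0], [0.25, 0.25, 0.5]]
```

Returning the symbols alongside the table lets the caller tell which column of each row belongs to which character.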

Dimitris Fasarakis Hilliard answered Oct 19 '22 08:10