
Count frequency of item in a list of tuples

I have a list of tuples, as shown below. I need to find the strings whose count is greater than 1. For example, in the list below the string 'example' appears twice, so it should be picked up. The code I have written so far is very slow, even for around 10K tuples. What is the best way to get the count of each string here, ideally by iterating over a generator?

List:

 b_data=[('example',123),('example-one',456),('example',987),.....]

My code so far:

blockslst=[]
for line in b_data:
    blockslst.append(line[0])

blocklstgtone=[]
for item in blockslst:
    if(blockslst.count(item)>1):
        blocklstgtone.append(item)

Asked Dec 16 '17 at 07:12 by min2bro



2 Answers

You've got the right idea extracting the first item from each tuple. You can make your code more concise using a list/generator comprehension, as I show you below.

From that point on, the most idiomatic manner to find frequency counts of elements is using a collections.Counter object.

  1. Extract the first elements from your list of tuples (using a comprehension)
  2. Pass this to Counter
  3. Query the count of 'example'

from collections import Counter

counts = Counter(x[0] for x in b_data)
print(counts['example'])

Sure, you can use list.count if it’s only one item you want to find frequency counts for, but in the general case, a Counter is the way to go.
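
As a rough illustration of the difference, the sketch below runs both approaches on a small made-up sample (the contents of `sample` are assumed for illustration, not the asker's real data). Calling `list.count` inside a loop rescans the whole list for every element, which is quadratic, whereas a `Counter` is built in a single pass.

```python
from collections import Counter

# Hypothetical sample standing in for the asker's b_data.
sample = [('example', 123), ('example-one', 456), ('example', 987)]
names = [name for name, _ in sample]

# Quadratic: every names.count(...) call scans the full list again.
slow = sorted({name for name in names if names.count(name) > 1})

# Linear: one pass to build the counts, one pass over the distinct keys.
counts = Counter(names)
fast = sorted(name for name, n in counts.items() if n > 1)

print(slow)  # ['example']
print(fast)  # ['example']
```

Both variants return the same strings; only the amount of work grows differently as the list gets longer.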


The advantage of a Counter is it performs frequency counts of all elements (not just example) in linear (O(N)) time. Say you also wanted to query the count of another element, say foo. That would be done with -

print(counts['foo'])

If 'foo' doesn’t exist in the list, 0 is returned.
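
A small sketch of that behaviour, using a made-up three-tuple sample: looking up a missing key on a `Counter` returns 0 and does not insert the key, whereas indexing a plain dict with a missing key would raise `KeyError`.

```python
from collections import Counter

# Hypothetical sample data for illustration.
sample = [('example', 123), ('example-one', 456), ('example', 987)]
counts = Counter(name for name, _ in sample)

print(counts['example'])  # 2
print(counts['foo'])      # 0, no KeyError
print('foo' in counts)    # False: the lookup did not add the key
```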

If you want to find the most common elements, call counts.most_common -

print(counts.most_common(n))

Where n is the number of elements you want to display. If you want to see everything, don't pass n.
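
For instance, on a small made-up list of letters (assumed purely for illustration), `most_common` with and without an argument could be used like this:

```python
from collections import Counter

# Hypothetical data: 'a' occurs 3 times, 'b' twice, 'c' once.
counts = Counter(['a', 'b', 'a', 'c', 'b', 'a'])

top_two = counts.most_common(2)    # only the two highest counts
everything = counts.most_common()  # all elements, highest count first

print(top_two)     # [('a', 3), ('b', 2)]
print(everything)  # [('a', 3), ('b', 2), ('c', 1)]
```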


To retrieve only the elements that occur more than once, one efficient way is to query most_common and then take the leading elements with counts over 1 using itertools.takewhile.

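from collections import Counter
from itertools import takewhile

l = [1, 1, 2, 2, 3, 3, 1, 1, 5, 4, 6, 7, 7, 8, 3, 3, 2, 1]
c = Counter(l)

# most_common() is sorted by count, descending, so stop at the first count <= 1
list(takewhile(lambda x: x[-1] > 1, c.most_common()))
# [(1, 5), (3, 4), (2, 3), (7, 2)]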

(OP edit) Alternatively, use a list comprehension to get a list of items having count > 1 -

[item[0] for item in counts.most_common() if item[-1] > 1]

Keep in mind that this isn’t as efficient as the itertools.takewhile solution. For example, if you have one item with count > 1 and a million items with count equal to 1, the comprehension checks all million and one entries when it doesn’t have to (because most_common returns frequency counts in descending order). With takewhile that isn’t the case, because iteration stops as soon as the condition count > 1 becomes false. Note that most_common() itself still sorts every entry, so the saving applies only to the filtering step.
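
Applied to the list of tuples from the question, the two variants might look like the sketch below. The three-tuple `b_data` is only the visible part of the asker's example; the elided remainder is left out.

```python
from collections import Counter
from itertools import takewhile

# Only the tuples visible in the question; the elided rest is omitted.
b_data = [('example', 123), ('example-one', 456), ('example', 987)]

# Counter accepts any iterable, so a generator expression is enough.
counts = Counter(name for name, _ in b_data)

# Stops at the first entry whose count is not greater than 1.
with_takewhile = [name for name, n in
                  takewhile(lambda pair: pair[1] > 1, counts.most_common())]

# Checks every entry, even after the counts have dropped to 1.
with_comprehension = [name for name, n in counts.most_common() if n > 1]

print(with_takewhile)      # ['example']
print(with_comprehension)  # ['example']
```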

Answered Sep 19 '22 at 12:09 by cs95


First method:

How about doing it without an explicit loop?

print(list(map(lambda x:x[0],b_data)).count('example'))

output:

2
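
A possible variation, not part of the original answer, avoids building the intermediate list by summing over a generator expression. The three-tuple `b_data` is assumed from the visible part of the question's example.

```python
# Assumed sample, taken from the visible part of the question's list.
b_data = [('example', 123), ('example-one', 456), ('example', 987)]

# Each matching tuple contributes 1; no temporary list is created.
occurrences = sum(1 for name, _ in b_data if name == 'example')

print(occurrences)  # 2
```

Like `list.count`, this only answers the question for one string at a time.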

Second method:

You can do the counting with a plain dict, without importing any module or making things too complex:

b_data = [('example', 123), ('example-one', 456), ('example', 987)]

dict_1 = {}
for i in b_data:
    # the first element of each tuple is the string being counted
    if i[0] not in dict_1:
        dict_1[i[0]] = 1
    else:
        dict_1[i[0]] += 1

print(dict_1)

# keep only the strings that occur more than once
print([(key, count) for key, count in dict_1.items() if count > 1])

output:

[('example', 2)]

Test case:

b_data = [('example', 123), ('example-one', 456), ('example', 987),('example-one', 456),('example-one', 456),('example-two', 456),('example-two', 456),('example-two', 456),('example-two', 456)]

output (dicts keep insertion order on Python 3.7+; older versions may print a different order):

[('example', 2), ('example-one', 3), ('example-two', 4)]
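
As an alternative sketch (not the answerer's code), the if/else branch can be folded into `dict.get` with a default of 0. The helper name `duplicated` is made up for this example, and it sorts the result explicitly by count, highest first, so the order does not depend on the Python version.

```python
def duplicated(pairs):
    """Return (name, count) for every name occurring more than once."""
    totals = {}
    for name, _ in pairs:
        # get() supplies 0 the first time a name is seen.
        totals[name] = totals.get(name, 0) + 1
    repeated = [(name, n) for name, n in totals.items() if n > 1]
    # Sort by count, highest first.
    return sorted(repeated, key=lambda pair: pair[1], reverse=True)


b_data = [('example', 123), ('example-one', 456), ('example', 987),
          ('example-one', 456), ('example-one', 456), ('example-two', 456),
          ('example-two', 456), ('example-two', 456), ('example-two', 456)]

result = duplicated(b_data)
print(result)  # [('example-two', 4), ('example-one', 3), ('example', 2)]
```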

Answered Sep 18 '22 at 12:09 by Aaditya Ura