Count frequency of item in a list of tuples

I have a list of tuples as shown below. I need to find the strings that occur more than once as the first element of a tuple. In the example below, the string 'example' appears two times, so it is one of the strings I have to get. The code that I have written so far is very slow, even with only around 10K tuples. My question is: what is the best way to get the count of each string while iterating over the generator?

List:

 b_data=[('example',123),('example-one',456),('example',987),.....]

My code so far:

blockslst=[]
for line in b_data:
    blockslst.append(line[0])

blocklstgtone=[]
for item in blockslst:
    if(blockslst.count(item)>1):
        blocklstgtone.append(item)

asked Dec 16 '17 07:12

min2bro

2 Answers

You've got the right idea extracting the first item from each tuple. You can make your code more concise using a list/generator comprehension, as I show you below.

From that point on, the most idiomatic manner to find frequency counts of elements is using a collections.Counter object.

1. Extract the first element from each tuple (using a comprehension)
2. Pass the result to Counter
3. Query the count of 'example'

from collections import Counter

counts = Counter(x[0] for x in b_data)
print(counts['example'])

Sure, you can use list.count if it’s only one item you want to find frequency counts for, but in the general case, a Counter is the way to go.
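As a rough illustration of that difference, the sketch below runs both approaches on the three tuples visible in the question (the trailing ..... is left out, so the sample data is an assumption). Calling list.count once per element rescans the whole list each time, which is what makes the question's loop quadratic, and it also reports a repeated name once per occurrence.

```python
from collections import Counter

# Assumed sample: only the three tuples visible in the question.
b_data = [('example', 123), ('example-one', 456), ('example', 987)]

names = [name for name, _ in b_data]

# Approach from the question: one full scan of `names` per element, O(N^2) overall.
repeated_slow = [name for name in names if names.count(name) > 1]

# Counter: a single pass builds every count, O(N) overall.
counts = Counter(names)
repeated_fast = [name for name, n in counts.items() if n > 1]

print(repeated_slow)  # ['example', 'example']  (one entry per occurrence)
print(repeated_fast)  # ['example']
```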

The advantage of a Counter is it performs frequency counts of all elements (not just example) in linear (O(N)) time. Say you also wanted to query the count of another element, say foo. That would be done with -

print(counts['foo'])

If 'foo' doesn’t exist in the list, 0 is returned.
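A small sketch of that behaviour, using made-up sample names: looking up a missing key on a Counter returns 0 without raising KeyError, and the lookup does not add the key.

```python
from collections import Counter

# Hypothetical sample data, not taken from the question.
counts = Counter(['example', 'example-one', 'example'])

print(counts['example'])  # 2
print(counts['foo'])      # 0, no KeyError
print('foo' in counts)    # False: the lookup did not insert 'foo'
```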

If you want to find the most common elements, call counts.most_common -

print(counts.most_common(n))

Where n is the number of elements you want to display. If you want to see everything, don't pass n.
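For instance, with a small hypothetical list whose counts are all different, most_common(2) returns the two most frequent (element, count) pairs, while most_common() with no argument returns every pair in descending order of count.

```python
from collections import Counter

# Hypothetical sample data with distinct counts (3, 2, 1).
counts = Counter(['a', 'b', 'a', 'c', 'a', 'b'])

top_two = counts.most_common(2)
everything = counts.most_common()

print(top_two)     # [('a', 3), ('b', 2)]
print(everything)  # [('a', 3), ('b', 2), ('c', 1)]
```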

To retrieve only the elements that occur more than once, one efficient way is to call most_common and then keep taking elements while their count is over 1, using itertools.takewhile.

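from collections import Counter
from itertools import takewhile

l = [1, 1, 2, 2, 3, 3, 1, 1, 5, 4, 6, 7, 7, 8, 3, 3, 2, 1]
c = Counter(l)

# most_common() is sorted by descending count, so stop at the first count of 1
print(list(takewhile(lambda x: x[-1] > 1, c.most_common())))
# [(1, 5), (3, 4), (2, 3), (7, 2)]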

(OP edit) Alternatively, use a list comprehension to get a list of items having count > 1 -

[item[0] for item in counts.most_common() if item[-1] > 1]

Keep in mind that this isn’t as efficient as the itertools.takewhile solution. For example, if you have one item with count > 1 and a million items with count equal to 1, the comprehension checks all million and one entries when it doesn’t have to (because most_common returns frequency counts in descending order). With takewhile that isn’t the case, because you stop iterating as soon as the condition count > 1 becomes false. Either way, most_common() has already sorted every entry, so the saving applies only to this final filtering pass.
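Putting the pieces together for the original question, here is a minimal sketch. The function name repeated_names is made up for this illustration, and the sample data is limited to the three tuples visible in the question. It accepts any iterable of tuples, including a generator, because Counter consumes its input in a single pass.

```python
from collections import Counter
from itertools import takewhile


def repeated_names(pairs):
    """Return (name, count) pairs for names that occur more than once.

    `pairs` can be any iterable of tuples whose first item is the name.
    """
    counts = Counter(name for name, *_ in pairs)
    # most_common() is sorted by descending count, so stop at the first count of 1.
    return list(takewhile(lambda pair: pair[1] > 1, counts.most_common()))


# Assumed sample: only the three tuples visible in the question.
b_data = [('example', 123), ('example-one', 456), ('example', 987)]

print(repeated_names(b_data))        # [('example', 2)]
print(repeated_names(iter(b_data)))  # same result from a one-shot iterator
```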

answered Sep 19 '22 12:09

cs95

First method:

What about doing it without an explicit loop?

print(list(map(lambda x:x[0],b_data)).count('example'))

output (for the three tuples shown in the question):

2

Second method:

You can also count with a plain dict, without importing any module and without making things complicated:

b_data = [('example', 123), ('example-one', 456), ('example', 987)]

# Count how many times each first element occurs
dict_1 = {}
for i in b_data:
    if i[0] not in dict_1:
        dict_1[i[0]] = 1
    else:
        dict_1[i[0]] += 1

print(dict_1)  # {'example': 2, 'example-one': 1}

# Keep only the (name, count) pairs whose count is greater than 1
print([(name, count) for name, count in dict_1.items() if count > 1])

output of the last line:

[('example', 2)]

Test case:

b_data = [('example', 123), ('example-one', 456), ('example', 987),('example-one', 456),('example-one', 456),('example-two', 456),('example-two', 456),('example-two', 456),('example-two', 456)]

output of the last line (dicts keep insertion order on Python 3.7+):

[('example', 2), ('example-one', 3), ('example-two', 4)]
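If the goal is to avoid imports, the same counting loop can be written a little more compactly with dict.get. This is only a sketch of one alternative: the helper name count_first_items is hypothetical, and the input is the test case above.

```python
def count_first_items(pairs):
    """Count how often each first element occurs, using a plain dict."""
    counts = {}
    for item in pairs:
        key = item[0]
        # dict.get supplies 0 the first time a key is seen.
        counts[key] = counts.get(key, 0) + 1
    return counts


b_data = [('example', 123), ('example-one', 456), ('example', 987),
          ('example-one', 456), ('example-one', 456), ('example-two', 456),
          ('example-two', 456), ('example-two', 456), ('example-two', 456)]

counts = count_first_items(b_data)
# Keep only the names that occur more than once.
repeated = {name: n for name, n in counts.items() if n > 1}
print(repeated)
```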

answered Sep 18 '22 12:09

Aaditya Ura


Tags: python, generator, list, python-3.x, tuples
