
PyMongo Aggregation "AttributeError: 'dict' object has no attribute '_txn_read_preference'"

I'm sure there is an error in my code since I'm a newbie to PyMongo, but I'll give it a go. The data in MongoDB is 167k+ and is as follows:

{'overall': 5.0,
 'reviewText': {'ago': 1,
                'buy': 2,
                'daughter': 1,
                'holiday': 1,
                'love': 2,
                'niece': 1,
                'one': 2,
                'still': 1,
                'today': 1,
                'use': 1,
                'year': 1},
 'reviewerName': 'dcrm'}

I would like to get a tally of terms used within that reviewText field for all 5.0 ratings. I have run the following code and I get the error that follows. Any insight?

#1 Find the top 20 most common words found in 1-star reviews.

aggr = [{"$unwind": "$reviewText"}, 
        {"$group": { "_id": "$reviewText", "word_freq": {"$sum":1}}}, 
        {"$sort": {"word_freq": -1}},
        {"$limit": 20},
        {"$project": {"overall":"$overall", "word_freq":1}}]
disk_use = { 'allowDiskUse': True }
findings = list(collection.aggregate(aggr, disk_use))

for item in findings:
    p(item)

As you can see, I came across the 'allowDiskUse' option since I seemed to exceed the 100MB threshold. But the error that I get is:

AttributeError: 'dict' object has no attribute '_txn_read_preference'
Matt_Davis asked May 01 '26 07:05



1 Answer

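You are quite close. allowDiskUse is a keyword argument, not a dictionary to pass positionally. In recent PyMongo versions the second positional parameter of aggregate is the session, so your dict is treated as a session object, which is why the error mentions '_txn_read_preference'. The statement should look like this: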

findings = list(collection.aggregate(aggr, allowDiskUse=True))

or

findings = list(collection.aggregate(aggr, **disk_use))
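To see why the positional dict goes wrong, here is a minimal sketch that runs without a MongoDB server. The `aggregate` function below is a hypothetical stand-in, not PyMongo code. It only assumes that the real method takes its arguments in the order pipeline, then session, then keyword options, and it simply reports where each argument ends up.

```python
def aggregate(pipeline, session=None, **kwargs):
    """Hypothetical stand-in for Collection.aggregate.

    It does not talk to MongoDB; it only reports which parameter
    each argument was bound to.
    """
    return {"session": session, "options": kwargs}


aggr = [{"$limit": 20}]
disk_use = {"allowDiskUse": True}

# Positional: the dict is bound to `session`, and no options are set.
wrong = aggregate(aggr, disk_use)

# Unpacked with **: the dict's keys become keyword options.
right = aggregate(aggr, **disk_use)

# Written out by hand: same result as unpacking.
explicit = aggregate(aggr, allowDiskUse=True)

print(wrong)
print(right)
```

In the positional call the dict sits where a session is expected, so a real driver would try to use it as a session. The other two calls are equivalent and pass allowDiskUse as an option.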
ManishSingh answered May 05 '26 10:05



