I have researched this but cannot find out why what I am trying is not working. I should warn that I am somewhat new to Python and very new to MongoDB. I have a MongoDB database of tweets in JSON which I am trying to query through Python and PyMongo. I want the 'text' and 'created_at' fields returned for all tweets that contain "IP".
I have tried the following, which works perfectly when I do this through the terminal:
db.tweets.find({text:/IP/},{text:1,created_at:1})
In Python, after experimenting I have found that I need to put the field names between quotes. I have gotten the following similar query to work:
cursor = db.tweets.find({'created_at':"Thu Apr 28 09:55:57 +0000 2016"},{'text':1,'created_at':1})
But when I try:
db.tweets.find({"text": /.*IP.*/},{'text':1,'created_at':1})
or
cursor = db.tweets.find({'text':/IP/},{'text':1,'created_at':1})
I get a
'SyntaxError: invalid syntax' at the "/IP/" part of the code.
I am using MongoDB 3.4.6 and Python 3.5.2.
PyMongo includes the distinct() method, which finds the distinct values for a specified field across a single collection and returns them as a list. Parameter: key, the field name for which the distinct values need to be found.
When a document is inserted a special key, "_id" , is automatically added if the document doesn't already contain an "_id" key. The value of "_id" must be unique across the collection. insert_one() returns an instance of InsertOneResult .
Python PyMongo MongoClient allows Developers to establish a connection between their Python application and MongoDB to manage data in a NoSQL Database. Python PyMongo MongoClient makes it easier for Developers to access all the features of the NoSQL Database and build a scalable and flexible Python application.
The find_one() method of PyMongo retrieves a single document matching your query. If nothing matches, it returns None, and if you do not pass a query, it returns the first document of the collection.
Python does not have special syntax for regexes like JavaScript has.
You need to compile the regex with the re module:
import re
rgx = re.compile('.*IP.*', re.IGNORECASE) # compile the regex
cursor = db.tweets.find({'text':rgx},{'text':1,'created_at':1})
You can use re.IGNORECASE as a flag if you want to match iP, Ip and ip as well. If you do not want that, you can drop the re.IGNORECASE part.
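To see what the flag changes without touching the database, here is a minimal sketch that runs both patterns over a few made-up strings. The sample texts are hypothetical stand-ins for the 'text' field. It also assumes, as the working shell query /IP/ suggests, that an unanchored pattern is matched anywhere in the field, so a plain 'IP' should behave like '.*IP.*'.

```python
import re

# Hypothetical sample values standing in for the 'text' field of stored tweets.
sample_texts = ["My IP changed", "ip lookup failed", "nothing relevant"]

case_sensitive = re.compile("IP")
case_insensitive = re.compile("IP", re.IGNORECASE)

# re.search looks for the pattern anywhere in the string,
# so no leading or trailing '.*' is needed.
matches_sensitive = [t for t in sample_texts if case_sensitive.search(t)]
matches_insensitive = [t for t in sample_texts if case_insensitive.search(t)]

# Filter and projection documents in the shape passed to db.tweets.find(...).
query = {"text": case_insensitive}
projection = {"text": 1, "created_at": 1}
```

With a live connection, the last two dictionaries would be passed as db.tweets.find(query, projection), exactly like the call above.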
'$regex' notation
Or you can specify that you are working with a regex with:
cursor = db.tweets.find({'text':{'$regex':'IP'}},{'text':1,'created_at':1})
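If you go the '$regex' route and still want case-insensitive matching, MongoDB accepts an '$options' key next to '$regex'. The helper below is only a sketch: build_text_filter is a hypothetical name, not part of PyMongo, and escaping the search term with re.escape is an assumption that the term should be matched literally rather than treated as a pattern.

```python
import re


def build_text_filter(term, ignore_case=False):
    """Build a filter document matching `term` anywhere in the 'text' field.

    Hypothetical helper for illustration; it only constructs a dict.
    """
    # Escape regex metacharacters so the term is matched literally (assumption).
    condition = {"$regex": re.escape(term)}
    if ignore_case:
        # 'i' is MongoDB's case-insensitive regex option.
        condition["$options"] = "i"
    return {"text": condition}


projection = {"text": 1, "created_at": 1}
ip_filter = build_text_filter("IP", ignore_case=True)
```

You would then call db.tweets.find(ip_filter, projection) on your own collection.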