I'm connecting to my MongoDB instance using pymongo:
from pymongo import MongoClient
mongo = MongoClient('localhost', 27017)
mongo_db = mongo['test']
mongo_coll = mongo_db['test']  # tweets collection
I have a cursor and am looping through every record:
cursor = mongo_coll.find()
for record in cursor:  # for all the tweets in the database
    try:
        msgurl = record["entities"]["urls"]  # look for URLs in the tweets
    except:
        continue
The reason for the try/except is that if ["entities"]["urls"] does not exist, it errors out.
How can I determine whether ["entities"]["urls"] exists?
record is a dictionary in which the key "entities" maps to another dictionary, so just check whether "urls" is in that dictionary.
if "urls" in record["entities"]:
If you just want to proceed in any case, you can also use get.
msgurl = record["entities"].get("urls")
This will cause msgurl to equal None if there is no such key.
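Both checks above still raise a KeyError if "entities" itself is missing from a record. A minimal sketch of a chained lookup that tolerates either key being absent follows; the helper name `extract_urls` and the sample records are made up for illustration and are not from the original post.

```python
def extract_urls(record):
    """Return record["entities"]["urls"], or None if either key is missing."""
    # .get("entities", {}) falls back to an empty dict, so the second
    # .get() can never raise a KeyError.
    return record.get("entities", {}).get("urls")


# Hypothetical sample records standing in for tweets from the collection
with_urls = {"entities": {"urls": [{"expanded_url": "http://example.com"}]}}
without_urls = {"entities": {"hashtags": []}}
without_entities = {"text": "no entities at all"}

for record in (with_urls, without_urls, without_entities):
    msgurl = extract_urls(record)
    if msgurl is None:
        continue  # no URLs stored for this record
    print(msgurl)
```

This assumes that "entities", when present, is always a dictionary; a record that stores None under "entities" would still fail.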
I'm not familiar with pymongo, but why don't you change your query so it only returns results that contain "urls"? Something like:
mongo_coll.find({"entities.urls": {"$exists": True}})
http://docs.mongodb.org/manual/reference/operator/exists/
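A rough sketch of how that query could replace the try/except loop from the question. In Python the operator name has to be a quoted string key, since a bare $exists is not valid syntax. The names `URLS_EXIST_FILTER` and `iter_tweet_urls` are hypothetical, and the function only assumes it is handed an object with a pymongo-style `find(filter)` method.

```python
# Filter matching only documents in which the nested field entities.urls is present
URLS_EXIST_FILTER = {"entities.urls": {"$exists": True}}


def iter_tweet_urls(collection):
    """Yield the entities.urls value of every document that has that field."""
    # The server does the filtering, so every returned record has both keys.
    for record in collection.find(URLS_EXIST_FILTER):
        yield record["entities"]["urls"]
```

With the collection from the question this would be used as `for msgurl in iter_tweet_urls(mongo_coll): ...`. Note that `$exists` matches on the presence of the field only, so documents whose "urls" value is null or an empty list are still returned.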