How do I compare dates from Twitter data stored in MongoDB via PyMongo?

Are the dates stored in the 'created_at' fields marshaled to Python datetime objects by PyMongo, or do I have to manually replace the text strings with Python datetime objects? A related question:

How do I convert a property in MongoDB from text to date type?

It seems highly unnatural that I would have to replace the date strings with Python date objects, which is why I'm asking the question.

I would like to write queries that display the tweets from the past three days. Please let me know if there is a slick way of doing this. Thanks!

vgoklani asked Jan 11 '12 02:01

People also ask

What data type does MongoDB use for storing dates?

The recommended way to store dates in MongoDB is to use the BSON Date data type. The BSON specification calls this type "UTC datetime"; it is a signed 64-bit integer holding the number of milliseconds since the Unix epoch, 00:00:00 UTC on 1 January 1970.
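
To make the "milliseconds since the epoch" representation concrete, here is a minimal sketch in plain Python. The helper names to_bson_millis and from_bson_millis are made up for illustration; a driver such as PyMongo does this conversion for you when it encodes a datetime.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch as a timezone-aware datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_bson_millis(dt):
    """Whole milliseconds between the epoch and an aware datetime (hypothetical helper)."""
    delta = dt - EPOCH
    return delta.days * 86_400_000 + delta.seconds * 1000 + delta.microseconds // 1000

def from_bson_millis(ms):
    """Inverse of to_bson_millis: rebuild an aware UTC datetime (hypothetical helper)."""
    return EPOCH + timedelta(milliseconds=ms)

tweet_time = datetime(2009, 6, 8, 10, 51, 32, tzinfo=timezone.utc)
millis = to_bson_millis(tweet_time)
print(millis)                    # 1244458292000
print(from_bson_millis(millis))  # 2009-06-08 10:51:32+00:00
```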

How is datetime stored in MongoDB?

MongoDB will store date and time information using UTC internally, but can easily convert to other timezones at time of retrieval as needed.

Can we store date as string in MongoDB?

You can store dates as strings and run range queries on them as long as every value uses the same sortable layout, such as ISO 8601 (“YYYY-MM-ddTHH:mm:ss”), in a single time zone.
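
The reason the string format matters is that string comparison is lexicographic. A short sketch with made-up sample timestamps shows that ISO 8601 strings sort in chronological order, while strings in Twitter's created_at layout do not, which is why a '$gte' query on raw Twitter strings would not behave like a date query.

```python
from datetime import datetime

# Made-up sample values in ISO 8601 layout.
iso_strings = ["2012-01-09T08:00:00", "2011-12-31T23:59:59", "2012-01-10T00:00:01"]
iso_as_text = sorted(iso_strings)
iso_as_dates = sorted(iso_strings, key=datetime.fromisoformat)
print(iso_as_text == iso_as_dates)  # True: text order matches date order

# The same instants in Twitter's created_at layout.
TWITTER_FORMAT = "%a %b %d %H:%M:%S %z %Y"
twitter_strings = [
    "Mon Jan 09 08:00:00 +0000 2012",
    "Sat Dec 31 23:59:59 +0000 2011",
    "Tue Jan 10 00:00:01 +0000 2012",
]
twitter_as_text = sorted(twitter_strings)
twitter_as_dates = sorted(
    twitter_strings, key=lambda s: datetime.strptime(s, TWITTER_FORMAT)
)
print(twitter_as_text == twitter_as_dates)  # False: text order starts with "Mon"
```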


1 Answer

You can parse Twitter's created_at timestamps into Python datetimes like so:

import datetime, pymongo
created_at = 'Mon Jun 8 10:51:32 +0000 2009' # Get this string from the Twitter API
# strptime is a method of the datetime class inside the datetime module
dt = datetime.datetime.strptime(created_at, '%a %b %d %H:%M:%S +0000 %Y')
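
The format string above hardcodes the '+0000' offset. As a variation (not part of the original answer), the %z directive parses whatever offset is present, and astimezone normalizes the result to UTC. The function name parse_created_at is hypothetical.

```python
from datetime import datetime, timezone

# %z reads the numeric UTC offset instead of assuming it is always +0000.
TWITTER_FORMAT = "%a %b %d %H:%M:%S %z %Y"

def parse_created_at(created_at):
    """Parse a Twitter-style timestamp into an aware UTC datetime (hypothetical helper)."""
    return datetime.strptime(created_at, TWITTER_FORMAT).astimezone(timezone.utc)

dt = parse_created_at('Mon Jun 8 10:51:32 +0000 2009')
print(dt.isoformat())  # 2009-06-08T10:51:32+00:00
```

This returns a timezone-aware value. If the rest of your code works with naive UTC datetimes, as the answer's snippets do, dt.replace(tzinfo=None) drops the offset after normalization.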

and insert them into your Mongo collection like this:

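# MongoClient replaces the Connection class, which was removed in PyMongo 3
connection = pymongo.MongoClient('mymongohostname.com')
# insert_one replaces insert(..., safe=True); writes are acknowledged by default
connection.my_database.my_collection.insert_one({
    'created_at': dt,
    # ... other info about the tweet ....
})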

And finally, to get tweets within the last three days, newest first:

three_days_ago = datetime.datetime.utcnow() - datetime.timedelta(days=3)
tweets = list(connection.my_database.my_collection.find({
    'created_at': { '$gte': three_days_ago }
}).sort([('created_at', pymongo.DESCENDING)]))
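
If tweets were already inserted with created_at as text, the range query above will not match them until the field is converted. Below is a rough sketch of a one-off backfill, assuming the collection object offers PyMongo-style find and update_one methods and that every stored string follows Twitter's layout. The function name convert_created_at_strings is hypothetical.

```python
from datetime import datetime, timezone

TWITTER_FORMAT = "%a %b %d %H:%M:%S %z %Y"

def convert_created_at_strings(collection):
    """Replace string 'created_at' values with UTC datetimes; return how many changed.

    `collection` is assumed to behave like a PyMongo collection
    (find and update_one); this helper is illustrative only.
    """
    converted = 0
    # {'$type': 'string'} selects only documents whose field is still text.
    for doc in collection.find({"created_at": {"$type": "string"}}):
        parsed = datetime.strptime(doc["created_at"], TWITTER_FORMAT)
        collection.update_one(
            {"_id": doc["_id"]},
            {"$set": {"created_at": parsed.astimezone(timezone.utc)}},
        )
        converted += 1
    return converted
```

With the names used in the answer, this might be called as convert_created_at_strings(connection.my_database.my_collection). Trying it on a copy of the data first is a sensible precaution, since a string in an unexpected format will raise ValueError partway through.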

A. Jesse Jiryu Davis answered Oct 05 '22 23:10