Tweepy extracts all the other information I need (everything except hashtags) via the tweepy.Cursor and api.search methods, as shown below. I know from the documentation that hashtags sit under the structure Status > entities > hashtags. I tried to locate "hashtags" by listing the attributes of the objects below, but to no avail:
print "tweet", dir(tweet)
print "////////////////"
print "tweet._api", dir(tweet._api)
print "////////////////"
print "tweet.text", dir(tweet.text)
print "////////////////"
print "tweet.entities", dir(tweet.entities)
print "////////////////"
print "tweet.author", dir(tweet.author)
print "////////////////"
print "tweet.user", dir(tweet.user)
My code is here:
import tweepy
ckey = ""
csecret = ""
atoken = ""
asecret = ""
OAUTH_KEYS = {'consumer_key':ckey, 'consumer_secret':csecret,
'access_token_key':atoken, 'access_token_secret':asecret}
auth = tweepy.OAuthHandler(OAUTH_KEYS['consumer_key'], OAUTH_KEYS['consumer_secret'])
api = tweepy.API(auth)
for tweet in tweepy.Cursor(api.search, q=('"good book"'), since='2014-09-16', until='2014-09-17').items(5):
    print "Name:", tweet.author.name.encode('utf8')
    print "Screen-name:", tweet.author.screen_name.encode('utf8')
    print "Tweet created:", tweet.created_at
    print "Tweet:", tweet.text.encode('utf8')
    print "Retweeted:", tweet.retweeted
    print "Favourited:", tweet.favorited
    print "Location:", tweet.user.location.encode('utf8')
    print "Time-zone:", tweet.user.time_zone
    print "Geo:", tweet.geo
    print "//////////////////"
Step-by-step approach: Import the required modules. Create a function to display tweet data. Create another function to scrape data for a given hashtag using the tweepy module. In the driver code, assign the Twitter Developer account credentials along with the hashtag, the initial date, and the number of tweets.
You use tweepy in exactly the same manner as with a single keyword, but the query parameter q should have your multiple keywords. For example, to search for tweets containing either the word "cupcake" or "donut" you pass in the string "cupcake OR donut" as the q parameter.
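As a rough illustration of the point above, the query string can be assembled from a list of keywords before it is passed as q. This is only a sketch: build_or_query is a hypothetical helper name, not part of tweepy, and quoting multi-word phrases follows the '"good book"' style used in the question.

```python
def build_or_query(keywords):
    """Join keywords with OR, quoting any multi-word phrase.

    Hypothetical helper for illustration; not a tweepy function.
    """
    parts = []
    for keyword in keywords:
        keyword = keyword.strip()
        if not keyword:
            continue  # skip empty entries
        # Wrap phrases containing spaces in double quotes for an exact match.
        parts.append('"%s"' % keyword if ' ' in keyword else keyword)
    return ' OR '.join(parts)


query = build_or_query(['cupcake', 'donut'])
print(query)
```

The resulting string would then be used as the q argument, for example tweepy.Cursor(api.search, q=query).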
Old way vs. Cursor way: Cursor handles all the pagination work for us behind the scenes, so our code can focus entirely on processing the results.
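To make the idea concrete, here is a toy sketch of what a cursor-style iterator does. It is not tweepy's actual implementation: iter_items and fake_fetch_page are made-up names, and the "pages" are hard-coded lists standing in for API responses.

```python
def iter_items(fetch_page, limit):
    """Yield up to `limit` items, requesting further pages only when needed."""
    page = 0
    yielded = 0
    while yielded < limit:
        results = fetch_page(page)
        if not results:
            return  # no more pages
        for item in results:
            if yielded >= limit:
                return
            yield item
            yielded += 1
        page += 1


def fake_fetch_page(page):
    """Stand-in for an API call; returns one hard-coded page of results."""
    pages = [['a', 'b'], ['c', 'd'], ['e']]
    return pages[page] if page < len(pages) else []


# The caller only loops over items; page handling stays inside iter_items.
for item in iter_items(fake_fetch_page, 4):
    print(item)
```

In the question's code, tweepy.Cursor(api.search, ...).items(5) plays the role of iter_items, so the loop body never has to track pages itself.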
Get the hashtags from the entities dictionary:
print tweet.entities.get('hashtags')
I don't have the rep to comment, but to answer Fabian Bosler's question: since entities is a dictionary, try
tweet.entities['hashtags']
That worked for me.
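Building on the answers above, the value under 'hashtags' is a list of dictionaries, each carrying the tag under a 'text' key, so the plain strings can be pulled out with a comprehension. The sketch below uses a hand-written sample in place of a real tweet.entities value; hashtag_texts is a hypothetical helper name.

```python
def hashtag_texts(entities):
    """Return the hashtag strings from an entities dictionary."""
    # .get with a default avoids a KeyError if 'hashtags' is missing.
    return [tag['text'] for tag in entities.get('hashtags', [])]


# Hand-written sample mirroring the shape of tweet.entities (not real data).
sample_entities = {
    'hashtags': [
        {'text': 'books', 'indices': [12, 18]},
        {'text': 'reading', 'indices': [19, 27]},
    ],
    'urls': [],
    'user_mentions': [],
}

print(hashtag_texts(sample_entities))
```

Inside the loop from the question, the call would be hashtag_texts(tweet.entities).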