How to extract the locations of tweets containing a specific keyword using the Twitter API in Python

I am trying to extract all tweets that contain a specific keyword, together with their geolocations.

For example, I want to download all English-language tweets that contain the keyword 'iphone' from 'france' and 'singapore'.

My code

import tweepy
import csv
import pandas as pd
import sys

# API credentials here
consumer_key = 'INSERT CONSUMER KEY HERE'
consumer_secret = 'INSERT CONSUMER SECRET HERE'
access_token = 'INSERT ACCESS TOKEN HERE'
access_token_secret = 'INSERT ACCESS TOKEN SECRET HERE'

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth,wait_on_rate_limit=True,wait_on_rate_limit_notify=True)

# Search word/hashtag value 
HashValue = ""

# search start date value. the search will start from this date to the current date.
StartDate = ""

# getting the search word/hashtag and date range from user
HashValue = input("Enter the hashtag you want the tweets to be downloaded for: ")
StartDate = input("Enter the start date in this format yyyy-mm-dd: ")

# Open/create a file to append data. newline='' avoids blank rows on Windows,
# and utf-8 keeps emoji and accented characters intact.
with open(HashValue + '.csv', 'a', newline='', encoding='utf-8') as csvFile:
    # Use csv writer
    csvWriter = csv.writer(csvFile)

    for tweet in tweepy.Cursor(api.search, q=HashValue, count=20, lang="en", since=StartDate, tweet_mode='extended').items():
        print(tweet.created_at, tweet.full_text)
        # Write the text as str; encoding to bytes would store "b'...'" in Python 3
        csvWriter.writerow([tweet.created_at, tweet.full_text])

print("Scraping finished and saved to " + HashValue + ".csv")
#sys.exit()

How can this be done?

Rahul rajan asked Jul 03 '19 03:07

1 Answer

Hello Rahul,

As I understand it, you are looking to get geo data off the tweets you searched for, rather than to filter the search by geocode.

The code sample further down prints the relevant fields you are interested in. These may or may not be populated, depending on the tweeter's privacy settings.
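If you do want to filter the search itself by location, the standard search endpoint also takes a geocode parameter in the form "latitude,longitude,radius". Here is a rough sketch of how that could look. The SEARCH_AREAS table, the radii, and the helper names are my own assumptions for illustration, not part of tweepy, and a circle around a city only approximates a country:

```python
# Hypothetical search areas: city-centre coordinates plus an arbitrary radius.
SEARCH_AREAS = {
    "france": (48.8566, 2.3522, 400),     # Paris, 400 km (assumed radius)
    "singapore": (1.3521, 103.8198, 30),  # Singapore, 30 km (assumed radius)
}


def geocode_param(lat, lon, radius_km):
    """Build the "lat,long,radius" string used by the geocode parameter."""
    return f"{lat},{lon},{radius_km}km"


def search_area(api, query, area, limit=20):
    """Yield up to `limit` tweets matching `query` near one of SEARCH_AREAS.

    `api` is an authenticated tweepy.API instance. This uses the tweepy 3.x
    name api.search, as in the rest of this post.
    """
    import tweepy  # imported here so the helpers above work without tweepy

    geocode = geocode_param(*SEARCH_AREAS[area])
    cursor = tweepy.Cursor(api.search, q=query, lang="en",
                           geocode=geocode, tweet_mode="extended")
    yield from cursor.items(limit)


print(geocode_param(*SEARCH_AREAS["france"]))
```

You would then loop over search_area(api, "iphone", "france") and search_area(api, "iphone", "singapore") separately, since a single request only takes one geocode.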

Note there is no "since" parameter on the search API:

https://tweepy.readthedocs.io/en/latest/api.html#help-methods

https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets

The standard Twitter search API only goes back 7 days. The premium and enterprise APIs offer 30-day search as well as full-archive search, but you will pay $$$.

Unfortunately, tweepy's models still aren't documented:
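Because of that 7-day window, a start date typed in by the user may be older than anything the standard search can return. A small sketch of clamping the date before searching, assuming you put the date into the query text with the since: search operator (the helper names and the SEARCH_WINDOW_DAYS constant are mine, for illustration only):

```python
from datetime import date, timedelta

SEARCH_WINDOW_DAYS = 7  # depth of the standard search index, as noted above


def clamp_start_date(start, today=None):
    """Return `start`, or the oldest reachable date if `start` is too old."""
    today = today or date.today()
    earliest = today - timedelta(days=SEARCH_WINDOW_DAYS)
    return max(start, earliest)


def query_since(keyword, start, today=None):
    """Append a since:YYYY-MM-DD operator to the search text."""
    clamped = clamp_start_date(start, today)
    return f"{keyword} since:{clamped.isoformat()}"


# A start date a month back gets pulled forward to the edge of the window.
print(query_since("iphone", date(2019, 6, 1), today=date(2019, 7, 3)))
```

The resulting string would be passed as q to the search call in place of the bare keyword.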

https://github.com/tweepy/tweepy/issues/720

So if you want to inspect the tweet object, you can use the pprint module from the standard library and run:

from pprint import pprint
pprint(tweet.__dict__)

One difference I noticed was that the "text" field in the JSON became "full_text" on the object, which I believe is a result of requesting tweet_mode='extended'.

If the tweet you found is a quote tweet, the object also carries information on the original tweet, with the same fields from what I could see.

Anyway, here's the code. I added a max tweet count for looping through the cursor while I was testing, to avoid blowing through any API limits.

Let me know if you want CSV code, but it looks like you can handle that already.
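If your code has to cope with both shapes, a tiny accessor keeps the rest of the script from caring which field is present. This is just a sketch: tweet_text is a name I made up, and the SimpleNamespace objects stand in for real tweepy Status objects:

```python
from types import SimpleNamespace


def tweet_text(tweet):
    """Return full_text when the tweet has it, otherwise fall back to text."""
    return getattr(tweet, "full_text", None) or getattr(tweet, "text", "")


# Stand-ins for tweets fetched with and without tweet_mode='extended'.
extended = SimpleNamespace(full_text="a long tweet about my #iPhone")
compat = SimpleNamespace(text="a short tweet")

print(tweet_text(extended))
print(tweet_text(compat))
```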
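To read the same fields off both tweets, something along these lines should work. It assumes the original tweet hangs off a quoted_status attribute, which is what I saw in the dumped object; with_quoted is a hypothetical helper, and the SimpleNamespace objects are stand-ins for real tweets:

```python
from types import SimpleNamespace


def with_quoted(tweet):
    """Yield the tweet itself, then the tweet it quotes, if there is one."""
    yield tweet
    quoted = getattr(tweet, "quoted_status", None)
    if quoted is not None:
        yield quoted


# Stand-in objects: a quote tweet wrapping an original, and a plain tweet.
original = SimpleNamespace(full_text="original tweet", place=None)
quote = SimpleNamespace(full_text="my comment", place=None,
                        quoted_status=original)
plain = SimpleNamespace(full_text="plain tweet", place=None)

for t in with_quoted(quote):
    print(t.full_text, t.place)
```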
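In case it helps anyway, here is a minimal sketch of the CSV side with the geo fields split into columns. The column layout and the tweet_row and write_tweets helpers are my own choices, and it assumes coordinates is a GeoJSON-style dict (longitude first) and place has a full_name, which is how the objects looked when I dumped them:

```python
import csv
import io
from types import SimpleNamespace

FIELDS = ["created_at", "text", "place", "longitude", "latitude",
          "user_location"]


def tweet_row(tweet):
    """Flatten one tweet into a CSV row, leaving missing geo fields blank."""
    place = tweet.place.full_name if tweet.place else ""
    if tweet.coordinates:
        lon, lat = tweet.coordinates["coordinates"]  # GeoJSON order: lon, lat
    else:
        lon, lat = "", ""
    return [tweet.created_at, tweet.full_text, place, lon, lat,
            tweet.user.location]


def write_tweets(tweets, out):
    """Write a header row and one row per tweet to the file-like `out`."""
    writer = csv.writer(out)
    writer.writerow(FIELDS)
    for tweet in tweets:
        writer.writerow(tweet_row(tweet))


# Stand-in tweet so the sketch runs without calling the API.
sample = SimpleNamespace(
    created_at="2019-07-03 03:07:00",
    full_text="my new #iPhone",
    place=SimpleNamespace(full_name="Paris, France"),
    coordinates={"type": "Point", "coordinates": [2.3522, 48.8566]},
    user=SimpleNamespace(location="Paris"),
)
buffer = io.StringIO()
write_tweets([sample], buffer)
print(buffer.getvalue())
```

For a real file, open it with newline='' and encoding='utf-8' and pass the file object as out.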

import tweepy

# API credentials here
consumer_key = 'your-info'
consumer_secret = 'your-info'
access_token = 'your-info'
access_token_secret = 'your-info'

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth,wait_on_rate_limit=True,wait_on_rate_limit_notify=True)

searchString = "iPhone"

cursor = tweepy.Cursor(api.search, q=searchString, count=20, lang="en", tweet_mode='extended')

maxCount = 1  # cap on tweets fetched while testing

# items(maxCount) stops the cursor after maxCount tweets
for tweet in cursor.items(maxCount):
    print()
    print("Tweet Information")
    print("================================")
    print("Text: ", tweet.full_text)
    print("Geo: ", tweet.geo)
    print("Coordinates: ", tweet.coordinates)
    print("Place: ", tweet.place)
    print()

    print("User Information")
    print("================================")
    print("Location: ", tweet.user.location)
    print("Geo Enabled? ", tweet.user.geo_enabled)

This will output something like:

Tweet Information
================================
Text:  NowPlaying : Hashfinger - Leaving
https://derp.com

#iPhone free app https://derp.com
#peripouwebradio
Geo:  None
Coordinates:  None
Place:  None

User Information
================================
Location:  Greece
Geo Enabled?  True
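
As the sample shows, the tweet-level fields are often None while the profile location is filled in. One way to handle that is to fall back from the most precise field to the least precise one. This is only a sketch: best_location is a hypothetical helper, the stand-in object mirrors the output above, and the profile location is free text typed by the user, so treat it as a hint rather than a verified place:

```python
from types import SimpleNamespace


def best_location(tweet):
    """Return (source, value) for the most precise location on a tweet."""
    if tweet.coordinates:
        lon, lat = tweet.coordinates["coordinates"]  # GeoJSON order: lon, lat
        return "coordinates", (lat, lon)
    if tweet.place:
        return "place", tweet.place.full_name
    if tweet.user.location:
        return "profile", tweet.user.location
    return "none", None


# Stand-in for the tweet printed above: no geo data, profile says Greece.
sample = SimpleNamespace(coordinates=None, place=None,
                         user=SimpleNamespace(location="Greece"))
print(best_location(sample))  # ('profile', 'Greece')
```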

Researcher answered Sep 23 '22 05:09