I am using the Tweepy API to extract Twitter feeds. I want to extract all tweets in a specific language only. The language filter works only if a track filter is provided. The following code returns a 406 error:
l = StdOutListener()
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
stream = Stream(auth, l)
stream.filter(languages=["en"])
How can I extract all the tweets in a certain language using Tweepy?
You can't (without special access). Streaming all the tweets (unfiltered) requires a connection to the firehose, which Twitter grants only in specific use cases. Honestly, the firehose isn't really necessary; proper use of track can get you more tweets than you know what to do with.
Try using something like this:
stream.filter(languages=["en"], track=["a", "the", "i", "you", "u"]) # etc
Filtering by words like that will get you many, many tweets. If you want real data for the most-used words, check out this article from Time: The 500 Most Frequently Used Words on Twitter. You can use up to 400 keywords, but a list that broad will likely hit the 1% limit on the share of all tweets delivered in a given time interval. If your track parameter matches 60% of all tweets at a given time, you will still only get 1% (which is a LOT of tweets).
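As a rough sketch of the points above, the helpers below build a track list capped at the 400-keyword limit and show the arithmetic of the 1% ceiling. The names build_track_list, delivered_share and start_filtered_stream are made up for illustration, and lower-casing and de-duplicating the keywords is an assumption rather than anything Tweepy requires. The only Tweepy call used is stream.filter(languages=..., track=...), exactly as in the snippet above.

```python
MAX_TRACK_KEYWORDS = 400  # keyword limit quoted in the answer above
STREAM_SHARE_CAP = 0.01   # the "1%" ceiling quoted in the answer above


def build_track_list(words, limit=MAX_TRACK_KEYWORDS):
    """Return unique, lower-cased keywords, truncated to `limit` entries.

    Hypothetical helper: normalising case and dropping duplicates is an
    assumption made here so that repeated words do not waste keyword slots.
    """
    seen = set()
    track = []
    for word in words:
        key = word.strip().lower()
        if key and key not in seen:
            seen.add(key)
            track.append(key)
    return track[:limit]


def delivered_share(matched_share, cap=STREAM_SHARE_CAP):
    """Share of all tweets delivered when the filter matches `matched_share`."""
    return min(matched_share, cap)


def start_filtered_stream(stream, words, languages=("en",)):
    """Start filtering on an existing stream.

    `stream` is assumed to be the Stream(auth, l) object built in the question.
    """
    stream.filter(languages=list(languages), track=build_track_list(words))
```

With these, start_filtered_stream(stream, ["a", "the", "i", "you", "u"]) behaves like the one-line call shown earlier, and delivered_share(0.60) evaluates to 0.01, matching the 60% example.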