I am using the Tweepy library in Python to search for tweets, and I am wondering whether I can use a regular expression in the search.
I am using the following code:
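import tweepy

# `api` is an authenticated tweepy.API instance (setup not shown)
query = 'ARNOLD OR SYLVESTER'  # Twitter's OR operator must be uppercase
for tweet in tweepy.Cursor(api.search,
                           q=query,
                           count=100,
                           result_type="recent",
                           include_entities=True,
                           lang="en").items():
    pass  # each tweet is post-processed here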
For instance, can I search for all tweets that use 'ARNOLD' or 'SYLVESTER' (all capitals, as a single word) and ignore all other tweets?
I am currently post-processing: I fetch all tweets containing Arnold or Sylvester and then check whether all the characters are uppercase. I am wondering whether this can be done through the API search itself.
Thanks
Unfortunately, Twitter doesn't support searching tweets with regular expressions, so you do have to post-process. There isn't any official Twitter documentation that says so explicitly, but post-processing with regex is what people who use the Twitter search API generally do (including me). Since there is no stated official position, I've tried just about every flavor of regex in search queries, with no luck. Per the Twitter search API documentation, queries must be:
A UTF-8, URL-encoded search query of 1,000 characters maximum, including operators. Queries may additionally be limited by complexity.
All queries are UTF-8 text and are searched as such. It'd be nice if there were a regex parameter we could specify in the API search call, but there isn't.
The reason is likely the additional processing cost that running a regex search over all tweets would impose on Twitter.
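For the post-processing step, here is a minimal sketch of the kind of filter I mean, using only Python's standard `re` module. The helper name `keep_uppercase_mentions` and the sample strings are made up for illustration; with Tweepy you would pass in the `tweet.text` values collected from your Cursor loop.

```python
import re

# Match ARNOLD or SYLVESTER only as whole, all-uppercase words.
# No re.IGNORECASE flag is set, so "Arnold" and "sylvester" do not match.
UPPERCASE_NAME = re.compile(r"\b(?:ARNOLD|SYLVESTER)\b")


def keep_uppercase_mentions(texts):
    """Return only the texts that contain ARNOLD or SYLVESTER in capitals."""
    return [text for text in texts if UPPERCASE_NAME.search(text)]


# Stand-in strings (hypothetical); in practice these would be tweet texts.
sample_texts = [
    "ARNOLD is back",
    "I met Arnold yesterday",
    "SYLVESTER and arnold in one movie",
    "ARNOLDS is not a match",
]
filtered = keep_uppercase_mentions(sample_texts)
print(filtered)
```

The `\b` word boundaries are what enforce the "single word" requirement: `ARNOLDS` is rejected, while `#SYLVESTER!` would still be accepted because `#` and `!` are not word characters. If hashtags or mentions should be excluded, the pattern would need to be tightened.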