Is there a stop word list for Twitter?

I want to do some mining on tweets. Is there a stop word list specific to tweets, one that also removes things like "lol" and Twitter smileys?

陈家泽 asked Apr 30 '15 03:04



2 Answers

I suggest merging an ordinary stop word list, like this one or that one, with a dictionary of Twitter-specific acronyms and slang, e.g. this slang dictionary, or that, or that, or that (the last one seems to be the easiest to parse; see the comments here for the idea).
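A minimal sketch of the merge, assuming both lists have already been loaded into Python sets. The words in `GENERAL_STOP_WORDS` and `TWITTER_SLANG` are tiny hypothetical stand-ins, not the contents of any of the lists mentioned above, and `remove_stop_words` is an illustrative helper name.

```python
import re

# Hypothetical stand-ins: in practice, load these from whichever stop word
# list and slang dictionary you pick.
GENERAL_STOP_WORDS = {"a", "an", "and", "the", "is", "to", "of", "in", "i"}
TWITTER_SLANG = {"lol", "rt", "omg", "lmao", "smh", "tbh"}

# The merged list is simply the union of the two sets.
STOP_WORDS = GENERAL_STOP_WORDS | TWITTER_SLANG


def remove_stop_words(tweet, stop_words=STOP_WORDS):
    """Lowercase the tweet, split it into simple tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9#@_']+", tweet.lower())
    return [token for token in tokens if token not in stop_words]


print(remove_stop_words("LOL the match is over omg"))
```

Keeping the two sources as separate sets makes it easy to swap either one out or to check which list a given word came from.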

Nikita Astrakhantsev answered Oct 12 '22 12:10


I'm not aware of a Twitter-specific stop word list, but you can get a list of the most frequent single words here: http://clic.cimec.unitn.it/amac/twitter_ngram/ (download en.1grams.gz). The most frequent words are natural stop word candidates.
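As a rough sketch, the most frequent words could be turned into a stop word set like this. The layout of en.1grams.gz is an assumption here (one `word<TAB>count` pair per line), so check the actual file before relying on it; `top_words` and `load_stop_words` are hypothetical helper names.

```python
import gzip
from collections import Counter


def top_words(lines, n=100):
    """Return the n most frequent words from lines of 'word<TAB>count'.

    The tab-separated layout is an assumption about the file, not a
    documented format.
    """
    counts = Counter()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip lines that do not match the assumed layout
        counts[parts[0].lower()] += int(parts[1])
    return {word for word, _ in counts.most_common(n)}


def load_stop_words(path, n=100):
    """Read a gzip-compressed frequency file and keep the top n words."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return top_words(handle, n)


# Made-up sample lines in the assumed layout.
sample = ["the\t900\n", "lol\t500\n", "The\t50\n", "laravel\t3\n", "broken line\n"]
print(top_words(sample, n=2))
```

The cutoff `n` is a judgment call: too small misses common filler, too large starts removing words that carry meaning for the mining task.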

To detect and then ignore smileys, use: https://github.com/brendano/tweetmotif
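If pulling in a full tokenizer is too much, a plain regular expression can catch the common Western-style smileys. This is a simplified stand-in, not tweetmotif's own emoticon handling, and the pattern below is an assumption that will miss many variants.

```python
import re

# Eyes, optional nose, one or more mouth characters, plus the "<3" heart.
# The lookbehind keeps the ":" in "http://" or "10:30" from matching.
EMOTICON = re.compile(
    r"(?<!\w)(?:[<>]?[:;=][-o*']?[)(\]\[dDpPoO/\\|3]+|<3)(?!\w)"
)


def find_emoticons(text):
    """Return every emoticon found in the text, in order of appearance."""
    return EMOTICON.findall(text)


def strip_emoticons(text):
    """Remove emoticons and collapse the leftover whitespace."""
    return " ".join(EMOTICON.sub(" ", text).split())


print(find_emoticons("great game :) lol :-D <3"))
print(strip_emoticons("see http://example.com :("))
```

Stripping emoticons before stop word removal keeps stray punctuation from turning into junk tokens.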

You may also find these tools useful:
https://github.com/willf/segment (if you want to segment hashtags)
https://github.com/amacinho/Rovereto-Twitter-Tokenizer (if you don't)
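To illustrate what hashtag segmentation means, here is a small dictionary-based sketch using dynamic programming. It is not the algorithm or API of the tools linked above; `segment` and the tiny `vocabulary` are hypothetical, and a real segmenter would typically use word frequencies rather than just preferring the fewest words.

```python
def segment(text, vocabulary):
    """Split text into the fewest vocabulary words; return None if impossible."""
    text = text.lower()
    # best[i] holds the best split of text[:i], or None if there is none.
    best = [None] * (len(text) + 1)
    best[0] = []
    for end in range(1, len(text) + 1):
        for start in range(end):
            piece = text[start:end]
            if best[start] is not None and piece in vocabulary:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[len(text)]


# Hypothetical vocabulary; a real one would come from a word list.
vocabulary = {"stop", "word", "words", "list", "top"}
print(segment("#StopWordList".lstrip("#"), vocabulary))
```

Once a hashtag is split into words, each word can be checked against the stop word list like any other token.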

zelandiya answered Oct 12 '22 11:10