
How to build and label a non-English dataset for sentiment analysis

I recently started a new sentiment analysis project, and I need to build a dataset in Persian. Since the quality of the dataset matters for the accuracy of the whole process, I want to build it as well as possible in the shortest time. What is the most efficient way to build and label a sentiment analysis dataset?

kosar_afr asked Sep 18 '25 07:09



1 Answer

You can use existing datasets as a reference for your own. There are many sources of sentiment analysis datasets:

google

sananalytics

kaggle

stanford

Here is a list of datasets that give the sentiment of individual words:

positivewordsresearch
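
One way a word-level sentiment list can speed up labeling is to pre-label sentences automatically and then have a human correct the provisional labels. The following is only a minimal sketch of that idea: the `LEXICON` contents and the `prelabel` function are hypothetical, the toy words are English for readability, and a Persian project would need a Persian word list and a proper tokenizer instead of whitespace splitting.

```python
# Hypothetical toy lexicon: word -> polarity score.
# In practice this would be loaded from a word-level sentiment list
# in the target language.
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "awful": -1, "hate": -1}


def prelabel(text, lexicon=LEXICON):
    """Sum word polarities and map the total to a provisional label."""
    # Naive whitespace tokenization; punctuation is not stripped.
    score = sum(lexicon.get(token, 0) for token in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


samples = ["I love this great phone", "awful battery and bad screen", "the phone arrived"]
for sentence in samples:
    print(sentence, "->", prelabel(sentence))
```

Labels produced this way are rough guesses, so they are best treated as a starting point for manual review rather than as final annotations.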

I suggest working with the datasets mentioned above to get familiar with how such datasets and their labels are structured.

Generally, sentiment datasets use a limited set of labels, such as "positive"/"negative"; "happy", "sad", "angry", and "neutral"; or "anger", "sadness", "surprise", "fear", "disgust", and "joy".
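
Whichever label set you pick, it helps to fix it up front and check every labeled row against it. Here is a small sketch of such a check; the scheme names (`polarity`, `mood`, `emotion`) and the `invalid_rows` helper are made up for illustration, and the rows are assumed to be `(text, label)` pairs.

```python
# The three label sets mentioned above, under hypothetical scheme names.
LABEL_SCHEMES = {
    "polarity": {"positive", "negative"},
    "mood": {"happy", "sad", "angry", "neutral"},
    "emotion": {"anger", "sadness", "surprise", "fear", "disgust", "joy"},
}


def invalid_rows(rows, scheme):
    """Return the indices of rows whose label is not in the chosen scheme."""
    allowed = LABEL_SCHEMES[scheme]
    return [i for i, (_, label) in enumerate(rows) if label not in allowed]


rows = [
    ("first sentence", "positive"),
    ("second sentence", "neutral"),  # not part of the "polarity" scheme
    ("third sentence", "negative"),
]
print(invalid_rows(rows, "polarity"))
```

Running a check like this after each labeling session catches typos and labels from the wrong scheme before they reach training.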

I hope this is useful to you.

majid ghafouri answered Sep 19 '25 23:09
