I am planning to do my final year project on Natural Language Processing (using NLTK) and my area of interest is Comment Summarization from Social media websites such as Facebook. For example, I am trying to do something like this:
Random Facebook comments on a picture:
Now, all these comments will get mapped (using a template-based comment summarization technique) into something like this:
3 people find this picture to be "beautiful".
The output will contain the word "beautiful" because it is used more often in the comments than the word "pretty" (and because "beautiful" and "pretty" are synonyms). To accomplish this task, I plan to use approaches such as tracking keyword frequency and keyword scores (in this scenario, "beautiful" and "pretty" have very close scores). Is this the best way to do it?
So far in my research I have found the following papers, but none of them addresses this kind of comment summarization:
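A minimal sketch of the keyword-frequency idea described above, using only the Python standard library. The synonym groups, the function name `summarize`, and the sample comments are all made up for illustration; in a real project the synonym groups might come from a WordNet lookup through NLTK rather than a hand-written list. Each comment is counted once per synonym group, and the most frequent word in the winning group is used to fill the template.

```python
import re
from collections import Counter

# Hypothetical, hand-written synonym groups (an assumption for this sketch).
SYNONYM_GROUPS = [
    {"beautiful", "pretty", "gorgeous"},
    {"funny", "hilarious"},
]


def summarize(comments, synonym_groups=SYNONYM_GROUPS):
    """Return a templated one-line summary, or None if no keyword matches."""
    word_to_group = {
        word: index
        for index, group in enumerate(synonym_groups)
        for word in group
    }
    comments_per_group = Counter()  # how many comments mention each group
    word_counts = Counter()         # how many comments mention each keyword

    for comment in comments:
        # Lowercase and keep each distinct word once per comment.
        tokens = set(re.findall(r"[a-z']+", comment.lower()))
        keywords = [t for t in tokens if t in word_to_group]
        for group_index in {word_to_group[t] for t in keywords}:
            comments_per_group[group_index] += 1
        for keyword in keywords:
            word_counts[keyword] += 1

    if not comments_per_group:
        return None

    best_group, people = comments_per_group.most_common(1)[0]
    # Pick the most frequent word of the group as its label.
    label = max(sorted(synonym_groups[best_group]), key=lambda w: word_counts[w])
    verb = "person finds" if people == 1 else "people find"
    return f'{people} {verb} this picture to be "{label}".'


# Made-up sample comments.
sample = ["So beautiful!", "Beautiful picture", "Very pretty"]
print(summarize(sample))
```

With these three sample comments, "beautiful" appears twice and "pretty" once, so the group is reported under the label "beautiful" with a count of 3.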
What are the other papers in this field which address a similar issue?
Apart from this, I also want my summarizer to improve with every summarization task. How do I apply machine learning in this regard?
Topic model clustering is what you are looking for.
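To make that a little more concrete, here is a rough sketch of one common topic model, latent Dirichlet allocation fitted with collapsed Gibbs sampling, written in plain Python. It is only an illustration under stated assumptions: the function names `lda_gibbs` and `cluster_by_topic`, the hyperparameter values, and the tokenized sample comments are all hypothetical, and a real project would more likely rely on an established library implementation. Each comment is then assigned to the topic with the highest estimated weight, which gives the clusters.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics=2, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists.

    Returns (theta, topic_word): per-document topic distributions and
    per-topic word counts.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # topic counts per doc
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # token counts per topic

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed estimate of each document's topic distribution.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word


def cluster_by_topic(theta):
    """Assign each document to its highest-weight topic."""
    return [max(range(len(row)), key=row.__getitem__) for row in theta]


# Made-up, already tokenized comments.
comments = [
    ["beautiful", "picture", "pretty"],
    ["pretty", "beautiful", "view"],
    ["funny", "caption", "hilarious"],
    ["hilarious", "funny", "joke"],
]
theta, topic_word = lda_gibbs(comments)
print(cluster_by_topic(theta))
```

Gibbs sampling is random, so the topic numbering (and, on tiny inputs like this, even the grouping) can vary with the seed and the hyperparameters.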
A search on Google Scholar for "topic model clustering" will give you lots of references on the subject.
To understand them, you need to be familiar with approaches for the following tasks, apart from the basics of machine learning in general.