There are datasets of ordinary email spam on the Internet, but I need datasets of fake reviews to conduct some research, and I can't find any. Can anybody advise me on where fake review datasets can be obtained?
Features used for fake review detection include: lexical features such as word n-grams, part-of-speech n-grams, and other lexical attributes; content and style similarity of reviews from different reviewers; and semantic inconsistency (we have never used this kind of feature).
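As a rough illustration of the word n-gram and content-similarity features mentioned above, here is a minimal sketch using only the Python standard library. The function names (`word_ngrams`, `cosine_similarity`), the simple regex tokenizer, and the example reviews are all assumptions made up for this illustration; they are not taken from any particular paper or dataset.

```python
import math
import re
from collections import Counter


def word_ngrams(text, n=2):
    """Return a Counter of word n-grams from a lowercased text.

    Tokenization is a simple regex split (an assumption for this sketch).
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical reviews, invented purely for illustration.
review_a = "Great hotel, great staff, would stay again"
review_b = "Great hotel, great staff, will stay again"
review_c = "The room was dirty and the service was slow"

# Near-duplicate reviews from different reviewers share most bigrams.
sim_ab = cosine_similarity(word_ngrams(review_a), word_ngrams(review_b))
# Unrelated reviews share none.
sim_ac = cosine_similarity(word_ngrams(review_a), word_ngrams(review_c))
print(f"a vs b: {sim_ab:.3f}, a vs c: {sim_ac:.3f}")
```

A high similarity between reviews posted under different reviewer accounts could be one signal worth inspecting, although on its own it does not show that a review is fake.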
This can be a real problem for brands that rely on third-party review sites, like Google Maps, to attract new customers. On Google, anyone can write a fake review that goes public instantly upon submission. And since Google is a third-party site, businesses can't just take the review down.
Reported figures vary: one source says 74% of consumers have read a fake review in the last year, though fake reviews are not always easy to spot, while BrightLocal research puts the figure at 82%.
Only 3% to 10% of people actually write reviews. 61% of electronics reviews have been deemed “fake.” There were 2+ million unverified reviews on Amazon as of March 2019.
Our dataset is available on my Cornell homepage: http://www.cs.cornell.edu/~myleott/
There is a recent ACL paper in which the authors compiled such a data set:
Finding Deceptive Opinion Spam by Any Stretch of the Imagination
Myle Ott, Yejin Choi, Claire Cardie, Jeffrey T. Hancock
You might be able to find something in the references. Alternatively, you can mail the authors and check if the data are publicly available.