Negation handling in NLP

Tags: python, regex, text-processing, nlp, nltk

I'm currently working on a project where I want to extract emotion from text. Because I'm using ConceptNet 5 (a semantic network), I can't simply prefix the words in a sentence that contains a negation word, since the prefixed words would not show up in ConceptNet 5's API.

Here's an example:

The movie wasn't that good.

Hence, I figured that I could use WordNet's lemma functionality to replace adjectives in sentences that contain negation words (not, ...).

In the previous example, the algorithm would detect `wasn't` and replace it with `was not`. It would then detect the negation word `not` and replace `good` with its antonym `bad`. The sentence would read:

The movie was that bad.
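A minimal sketch of that replacement step, to make the idea concrete. The antonym table is a tiny hand-written stand-in for a WordNet lookup, and the names `ANTONYMS`, `NEGATION_WORDS`, and `replace_negated_adjective` are hypothetical; it also flips the first word with a known antonym without checking that it is an adjective.

```python
# Hypothetical stand-in for a WordNet antonym lookup; a real system would
# query a lexical resource instead of this hand-written table.
ANTONYMS = {"good": "bad", "bad": "good", "happy": "unhappy"}
NEGATION_WORDS = {"not", "never"}


def replace_negated_adjective(tokens):
    """Remove a negation word and flip the next word that has a known antonym.

    If no later word has an antonym, the negation word is left in place.
    """
    result = list(tokens)
    i = 0
    while i < len(result):
        if result[i].lower() in NEGATION_WORDS:
            # Look ahead for the first word with a known antonym.
            target = next(
                (j for j in range(i + 1, len(result))
                 if result[j].lower() in ANTONYMS),
                None,
            )
            if target is not None:
                result[target] = ANTONYMS[result[target].lower()]
                del result[i]  # the antonym now carries the negation
                continue       # re-check the token that moved into slot i
        i += 1
    return result


print(" ".join(replace_negated_adjective("The movie was not that good .".split())))
```

Sentences without a flippable word, such as "I do not know", pass through unchanged.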

While I see that this isn't the most elegant way, and that it will probably produce the wrong result in many cases, I'd still like to handle negation this way, as I frankly don't know a better approach.

As for my problem: unfortunately, I did not find any library that would let me expand all occurrences of contracted negations (`wasn't` => `was not`). I could do it manually by replacing the occurrences with a regex, but then I would be stuck with the English language.
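For reference, the manual regex route might look roughly like the sketch below. It is English-only and incomplete by design: the `IRREGULAR` table and the function name `expand_negations` are illustrative assumptions, it only handles the straight apostrophe, and it does not preserve the capitalization of irregular forms.

```python
import re

# Irregular contractions need explicit entries (illustrative, not exhaustive);
# everything else is assumed to follow the regular "<verb>n't" pattern.
IRREGULAR = {"can't": "cannot", "won't": "will not", "shan't": "shall not"}

_IRREGULAR_RE = re.compile(
    r"\b(" + "|".join(re.escape(form) for form in IRREGULAR) + r")\b",
    re.IGNORECASE,
)
_REGULAR_RE = re.compile(r"\b([A-Za-z]+)n't\b")


def expand_negations(text):
    """Rewrite contracted negations such as "wasn't" as "was not"."""
    # Handle the irregular forms first so the generic rule never sees them.
    text = _IRREGULAR_RE.sub(lambda m: IRREGULAR[m.group(1).lower()], text)
    return _REGULAR_RE.sub(r"\1 not", text)


print(expand_negations("The movie wasn't that good."))
# The movie was not that good.
```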

Therefore I'd like to ask whether anyone knows a library, function, or better method that could help me here. I'm currently using Python's nltk; it doesn't seem to contain such functionality, but I may be wrong.

Thanks in advance :)

asked Feb 25 '15 13:02

Tim Daubenschütz

1 Answer

Cases like `wasn't` can be handled simply by tokenization (`tokens = nltk.word_tokenize(sentence)`): `wasn't` will be split into `was` and `n't`.
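Once the tokens are split this way, turning `n't` back into `not` is a small post-processing step. The sketch below works on a hand-written token list that stands in for the tokenizer output, so it runs without nltk. The function name `normalize_negation_tokens` and the `IRREGULAR_STEMS` table are hypothetical, and the table assumes the tokenizer leaves stems like `ca` and `wo` for `can't` and `won't`.

```python
# Assumed stems left behind for irregular contractions ("can't" -> "ca", "n't").
IRREGULAR_STEMS = {"ca": "can", "wo": "will", "sha": "shall"}


def normalize_negation_tokens(tokens):
    """Replace each "n't" token with "not", repairing irregular stems."""
    result = []
    for token in tokens:
        if token.lower() == "n't":
            # Fix the preceding stem if it is one of the irregular ones.
            if result and result[-1].lower() in IRREGULAR_STEMS:
                result[-1] = IRREGULAR_STEMS[result[-1].lower()]
            result.append("not")
        else:
            result.append(token)
    return result


# Hand-written stand-in for the tokenizer output described above.
tokens = ["The", "movie", "was", "n't", "that", "good", "."]
print(normalize_negation_tokens(tokens))
# ['The', 'movie', 'was', 'not', 'that', 'good', '.']
```

In practice the list returned by `nltk.word_tokenize(sentence)` could be passed in instead of the hand-written one.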

However, negative meaning can also be expressed by 'Quasi negative words, like hardly, barely, seldom' and 'Implied negatives, such as fail, prevent, reluctant, deny, absent'; see this paper. An even more detailed analysis can be found in Christopher Potts' On the negativity of negation (https://web.stanford.edu/~cgpotts/papers/potts-salt20-negation.pdf).

As for your underlying problem, sentiment analysis: most modern approaches, as far as I know, don't process negation explicitly; instead, they use supervised learning with high-order n-grams. Those that do process negation usually add a special prefix, `NOT_`, to every word between the negation and the next punctuation mark.
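A minimal sketch of that prefixing scheme, assuming already-tokenized input. The negation list, the punctuation pattern, and the function name `mark_negation_scope` are illustrative choices, not a fixed standard.

```python
import re

# Illustrative negation cues; extend as needed.
NEGATION_TOKENS = {"not", "no", "never", "n't", "cannot"}
# A token made only of these characters closes the negation scope.
PUNCTUATION_RE = re.compile(r"^[.,:;!?]+$")


def mark_negation_scope(tokens):
    """Prefix every token between a negation cue and the next punctuation."""
    marked = []
    in_scope = False
    for token in tokens:
        if PUNCTUATION_RE.match(token):
            in_scope = False          # punctuation ends the scope
            marked.append(token)
        elif token.lower() in NEGATION_TOKENS:
            in_scope = True           # the cue itself stays unprefixed
            marked.append(token)
        elif in_scope:
            marked.append("NOT_" + token)
        else:
            marked.append(token)
    return marked


tokens = ["The", "movie", "was", "n't", "that", "good", ".", "I", "liked", "it"]
print(mark_negation_scope(tokens))
# ['The', 'movie', 'was', "n't", 'NOT_that', 'NOT_good', '.', 'I', 'liked', 'it']
```

The prefixed tokens become features in their own right, so a supervised classifier can learn that `NOT_good` carries a different signal than `good`.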

answered Oct 23 '22 02:10

Nikita Astrakhantsev
