
How can I vary the sentence prefix "I am working on [X]" such that it has correct sentence structure for all X?

Tags:

nlp

api

I want the user to be able to enter a task and I will prefix it appropriately such that it has correct sentence structure.

E.g.

I am working on [making the world a better place]

...sounds good.

I am working on [discuss draft proposal]

...doesn't sound good. In this case I would want the program to respond with something like:

I am discussing a draft proposal

Basically the way people write tasks or todos appears to be imperative (e.g. pick up milk, write essay, etc.) or simply a noun (e.g. assignment 1, client meeting, etc.). I want to convert these to Present Progressive tense.

I am looking into the field of Natural Language Processing at the moment, but I was wondering if there was some sort of API available that would do what I need, or if someone has had experience with a similar problem.

vaughan asked Jan 21 '12 11:01



1 Answer

In addition to natural language processing, you're also asking about natural language generation: http://en.wikipedia.org/wiki/Natural_language_generation

You can try to use a parser (like the Stanford parser) to figure out which kind of phrase you have on hand and to identify the main verb if there is one. You might just fall back on a part-of-speech tagger for this. In English you'll also want to identify "helping" verbs (called "auxiliaries" in technical articles) like "will", "may", "can", etc. that often come right before the verb because these can change the tense as well.
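As a rough sketch of that first step, here is one way to decide what kind of phrase you have once a tagger has run. It assumes the input is already a list of (word, tag) pairs using Penn Treebank tags, where modals such as "will", "may", and "can" are tagged MD. The function name `classify` and the tag set `NOMINAL_TAGS` are made up for illustration, and a real tagger may well mislabel short imperative phrases, so treat this as a starting point only.

```python
# Tags that may appear inside a simple noun phrase (an illustrative choice).
NOMINAL_TAGS = {"DT", "JJ", "CD", "POS", "PRP$"}


def classify(tagged):
    """Classify a tagged task as a verb phrase, noun phrase, or other.

    tagged: list of (word, tag) pairs with Penn Treebank tags.
    Returns (kind, main_verb_index); the index is None unless a verb is found.
    """
    if not tagged:
        return ("other", None)

    # Skip leading modal auxiliaries ("will", "may", "can", ...), tagged MD.
    i = 0
    while i < len(tagged) and tagged[i][1] == "MD":
        i += 1

    # The first non-modal token being a verb suggests a verb phrase.
    if i < len(tagged) and tagged[i][1].startswith("VB"):
        return ("verb_phrase", i)

    # No modals skipped and only nominal material: call it a noun phrase.
    if i == 0 and all(t.startswith("NN") or t in NOMINAL_TAGS for _, t in tagged):
        return ("noun_phrase", None)

    return ("other", None)
```

For example, `classify([("will", "MD"), ("discuss", "VB"), ("proposal", "NN")])` reports a verb phrase whose main verb is at index 1.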

If it's just a noun phrase, "I'm working on X" will likely sound okay. If it's a bare nominal (that is, if the Stanford parser gives you only NNs, with no nested NPs, NNPs, or DTs inside the top NP), then it might sound better with an article attached. E.g. "pepper project" -> "I'm working on the pepper project". You wouldn't do that for "Pepper's project", or if it's already "the pepper project", or for most proper nouns. There are always tricky cases though.
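A string-only approximation of that article rule might look like the sketch below. It does not use a parser at all: the determiner list, the capital-letter test for proper nouns, and the skip for numbered items are all my own stand-ins, and `with_article` is a hypothetical name.

```python
# Words that already play the role of an article or determiner (partial list).
DETERMINERS = {
    "a", "an", "the", "this", "that", "these", "those",
    "my", "your", "his", "her", "its", "our", "their",
    "some", "any", "each", "every", "no",
}


def with_article(noun_phrase):
    """Prepend "the" to a bare nominal; leave everything else untouched."""
    words = noun_phrase.split()
    if not words:
        return noun_phrase
    first = words[0]
    if first.lower() in DETERMINERS:
        return noun_phrase  # already has a determiner
    if first.endswith("'s") or first.endswith("s'"):
        return noun_phrase  # possessive, e.g. "Pepper's project"
    if first[0].isupper():
        return noun_phrase  # crude guess at a proper noun
    if any(w.isdigit() for w in words):
        return noun_phrase  # numbered items such as "assignment 1"
    return "the " + noun_phrase
```

With this, "pepper project" becomes "the pepper project", while "Pepper's project" and "the pepper project" pass through unchanged. Plurals and mass nouns are among the tricky cases it does not handle.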

If it's a verb phrase: If it's already progressive, great. Else:

Use a lemmatizer (or fall back on a stemmer) to get the root form of the main verb.
Expand that root form into the present progressive. A few heuristics will probably suffice for this, based on whether the lemma ends in a vowel or in a consonant that gets doubled. E.g. "walk" -> "walking", "run" -> "running" (double n), "fly" -> "flying" (y doesn't behave like a vowel in this case), "glide" -> "gliding" (drop the last e after a consonant), but "flee" -> "fleeing" (not after a vowel). The most comprehensive place to look for regularities and exceptions is A Comprehensive Grammar of the English Language or a similar online resource. Tools for this include morphg and MorphAdorner.

Finally, remove any helping verbs and substitute the present progressive form for the main verb. While this won't be perfect, it'll probably look smarter than most.
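The verb-phrase steps above can be sketched roughly as follows. This assumes the main verb has already been located and reduced to its lemma (by whatever tagger and lemmatizer you choose), and that only helping verbs precede it. The function names are hypothetical, and the doubling rule is limited to one-syllable words, so verbs like "begin" or "argue" will come out wrong without an exception list.

```python
import re

VOWELS = "aeiou"


def present_participle(lemma):
    """Heuristically turn a verb lemma into its -ing form."""
    w = lemma.lower()
    # "die" -> "dying"
    if w.endswith("ie"):
        return w[:-2] + "ying"
    # Drop a final e after a consonant: "glide" -> "gliding",
    # but keep it after a vowel: "flee" -> "fleeing".
    if w.endswith("e") and len(w) > 2 and w[-2] not in VOWELS:
        return w[:-1] + "ing"
    # Double the final consonant of a one-syllable consonant-vowel-consonant
    # ending: "run" -> "running". The letters w, x, y are never doubled.
    if (
        len(w) >= 3
        and w[-1] not in VOWELS
        and w[-1] not in "wxy"
        and w[-2] in VOWELS
        and w[-3] not in VOWELS
        and len(re.findall(r"[aeiou]+", w)) == 1
    ):
        return w + w[-1] + "ing"
    return w + "ing"


def to_progressive(task, verb_index=0):
    """Drop everything before the main verb and rewrite it as "I am ...ing"."""
    words = task.split()
    if not words:
        return task
    head = present_participle(words[verb_index])
    rest = words[verb_index + 1:]
    return " ".join(["I am", head] + rest)
```

So `to_progressive("will pick up milk", verb_index=1)` gives "I am picking up milk". Inserting an article, as in "discussing a draft proposal", would still be a separate step.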

If it's an entire clause (sentence-like thing with a subject too) or a question, or some other larger thing, you might cop out and just use a generic prefix, like "Right now: Has Jenn gotten back to me?" "Right now: I must head out!"
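A cheap stand-in for that fallback, without any parsing, is to look at punctuation and the first word. The pronoun list and both function names below are assumptions for the sake of the example; a parser that reports a full clause (a subject plus a verb phrase) would be the more reliable signal.

```python
SUBJECT_PRONOUNS = {"i", "you", "he", "she", "it", "we", "they"}


def looks_like_clause(task):
    """Guess whether the task is a whole sentence or question."""
    text = task.strip()
    if not text:
        return False
    # Questions and exclamations are treated as complete utterances.
    if text.endswith(("?", "!")):
        return True
    # A leading subject pronoun suggests a full clause.
    return text.split()[0].lower() in SUBJECT_PRONOUNS


def prefix_task(task):
    """Use the generic prefix for clauses, the normal prefix otherwise."""
    text = task.strip()
    if looks_like_clause(text):
        return "Right now: " + text
    return "I am working on " + text
```

Here "Has Jenn gotten back to me?" and "I must head out!" both get the "Right now:" prefix, while "making the world a better place" keeps "I am working on".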

I'm not an expert, so I may have missed some tools already out there for this kind of thing, and if so, I hope to learn that from others. It's not an easy thing to do, but it sounds pretty useful. There will always be mistakes, and they might be jarring to your users, or perhaps they'll find the oddities endearing. If you put something together, will you post the API here?

Gregory Marton answered Nov 03 '22 20:11
