
A "regex for words" (semantic replacement) - any example syntax and libraries?

I'm looking for syntactic examples or common techniques for doing regular-expression-style transformations on words instead of characters, in a procedural language.

For example, to trace copying, one would want to create a document with a similar meaning but different word choices.

I'd like to be able to concisely define these possible transformations that I can apply to a text stream.

E.g. "fast noun" to "rapid noun", but "go fast." wouldn't get transformed (no noun afterwards).
Or: "Alice will sing song" to "song will be sung by Alice".
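To make the two examples above concrete, here is a minimal sketch in Python (not C#) of rules that match on tokens rather than characters. It assumes the text has already been split into (word, part-of-speech) pairs by some tagger; the tag names, the `Rule` type, the `apply_rules` helper and the tiny `PARTICIPLES` table are all hypothetical names made up for illustration, not part of any library.

```python
from typing import Callable, List, NamedTuple, Sequence, Tuple

Token = Tuple[str, str]            # (word, part-of-speech tag), tags assumed given
Matcher = Callable[[Token], bool]  # predicate over a single token


class Rule(NamedTuple):
    pattern: Sequence[Matcher]                     # one matcher per token
    rewrite: Callable[[List[Token]], List[Token]]  # matched tokens -> new tokens


def word(w: str) -> Matcher:
    """Match a token by its (case-insensitive) word."""
    return lambda t: t[0].lower() == w


def tag(p: str) -> Matcher:
    """Match a token by its part-of-speech tag."""
    return lambda t: t[1] == p


# Hypothetical lookup table; a real system would need a morphology component.
PARTICIPLES = {"sing": "sung"}


def known_verb(t: Token) -> bool:
    return t[1] == "VERB" and t[0].lower() in PARTICIPLES


RULES = [
    # "fast <noun>" -> "rapid <noun>"
    Rule([word("fast"), tag("NOUN")],
         lambda m: [("rapid", "ADJ"), m[1]]),
    # "<subject> will <verb> <object>" -> "<object> will be <participle> by <subject>"
    Rule([tag("PROPN"), word("will"), known_verb, tag("NOUN")],
         lambda m: [m[3], m[1], ("be", "AUX"),
                    (PARTICIPLES[m[2][0].lower()], "VERB"),
                    ("by", "ADP"), m[0]]),
]


def apply_rules(tokens: List[Token], rules: Sequence[Rule]) -> List[Token]:
    """Scan left to right; at each position apply the first rule that matches."""
    out: List[Token] = []
    i = 0
    while i < len(tokens):
        for rule in rules:
            n = len(rule.pattern)
            window = tokens[i:i + n]
            if len(window) == n and all(m(t) for m, t in zip(rule.pattern, window)):
                out.extend(rule.rewrite(window))
                i += n
                break
        else:
            # No rule matched here: copy the token through unchanged.
            out.append(tokens[i])
            i += 1
    return out


def words(tokens: List[Token]) -> str:
    """Join the words of a token list back into a string."""
    return " ".join(w for w, _ in tokens)
```

With hand-tagged input, `words(apply_rules([("fast", "ADJ"), ("car", "NOUN")], RULES))` rewrites the phrase, while `[("go", "VERB"), ("fast", "ADV"), (".", "PUNCT")]` passes through untouched because no noun follows "fast". The hard part, as the question suggests, is the rule set and the tagging, not the matching loop.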

I'd expect this to be done in grammar checkers, for example when detecting passive voice.

A C# implementation for this sort of language processing would be really neat, but I think the bulk of any effort is coming up with the right rules. Keeping the rules clear and understandable seems like a good place to begin.

Procedural Throwback asked Oct 23 '08 05:10



2 Answers

You could try WordNet::QueryData, a Perl module by Jason Rennie (CPAN distribution WordNet-QueryData-1.47).

bugmagnet answered Sep 24 '22 06:09



One good place to start researching would be WordNet - it's a lexical database that groups words together by similar meaning and also records the relationships between words in useful ways.

There are a bunch of software projects leveraging the WordNet data; one of them may be what you need.
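As a rough illustration of the "grouping words by similar meaning" idea, here is a small Python sketch. The `SYNSETS` list is a hand-written, hypothetical stand-in for WordNet's synonym sets, and `synonyms` is a made-up helper; a real project would query the actual WordNet data through one of the existing libraries instead.

```python
from typing import List, Set

# Hypothetical stand-in for WordNet synonym sets (synsets); not real WordNet data.
SYNSETS: List[Set[str]] = [
    {"fast", "rapid", "quick"},
    {"big", "large"},
]


def synonyms(word: str) -> List[str]:
    """Return every other word that shares a synonym set with `word`."""
    found: Set[str] = set()
    for group in SYNSETS:
        if word in group:
            found |= group - {word}
    return sorted(found)
```

A word-level rewrite rule could then draw its replacement candidates from `synonyms("fast")` rather than hard-coding "rapid".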

Bevan answered Sep 23 '22 06:09
