Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a C# utility for matching patterns in (syntactic parse) trees?

I'm working on a Natural Language Processing (NLP) project in which I use a syntactic parser to create a syntactic parse tree out of a given sentence.

Example Input: I ran into Joe and Jill and then we went shopping
Example Output: [TOP [S [S [NP [PRP I]] [VP [VBD ran] [PP [IN into] [NP [NNP Joe] [CC and] [NNP Jill]]]]] [CC and] [S [ADVP [RB then]] [NP [PRP we]] [VP [VBD went] [NP [NN shopping]]]]]] enter image description here

I'm looking for a C# utility that will let me do complex queries like:

  • Get the first VBD related to 'Joe'
  • Get the NP closest to 'Shopping'

Here's a Java utility that does this, I'm looking for a C# equivalent.
Any help would be much appreciated.

like image 421
Gilad Gat Avatar asked Feb 03 '13 15:02

Gilad Gat


People also ask

Why is there a slash between AC?

Senior Member. You read it as "ei-cee" (no "slash" pronounced). In terms of distinguishing between "air conditioning" and "air conditioner," I can think of an example like "Today, I bought a new air conditioner" ("conditioning" not allowed). I personally would not say "Today, I bought a new AC."

What is AC?

What A/C Means. The term “A/C” stands for “air conditioning,” but it's frequently used to describe any type of home cooling equipment, such as a traditional split-system air conditioner or heat pump, mini-split unit, geothermal system, or even a window unit.

Why is it called AC?

He combined moisture with ventilation to "condition" and change the air in the factories, controlling the humidity so necessary in textile plants. Willis Carrier adopted the term and incorporated it into the name of his company. Domestic air conditioning soon took off.

Does AC include heat?

In the air conditioning industry, the term HVAC is often used instead of AC. HVAC refers to heating, ventilation, and air conditioning, whereas AC simply refers to air conditioning. AC is generally used when referring to systems that are designed to cool the air in your home.


2 Answers

There are at least two NLP frameworks, i.e.

  • SharpNLP (NOTE: project inactive since 2006)
  • Proxem

And here you can find instructions to use a java NLP in .NET:

  • Using OpenNLP in .NET project

This page is about using java OpenNLP, but could apply to the java library you've mentioned in your post

Or use NLTK following this guidelines:

  • Open Source NLP in C# 3.5 using NLTK
like image 95
JotaBe Avatar answered Oct 06 '22 00:10

JotaBe


We already use

One option would be to parse the output into C# code and then encoding it to XML making every node into string.Format("<{0}>", this.Name); and string.Format("</{0}>", this._name); in the middle put all the child nodes recursively.

After you do this, I would use a tool for querying XML/HTML to parse the tree. Thousands of people already use query selectors and jQuery to parse tree-like structure based on the relation between nodes. I think this is far superior to TRegex or other outdated and un-maintained java utilities.

For example, this is to answer your first example:

var xml = CQ.Create(d.ToXml());
//this can be simpler with CSS selectors but I chose Linq since you'll probably find it easier
//Find joe, in our case the node that has the text 'Joe'
var joe = xml["*"].First(x => x.InnerHTML.Equals("Joe")); 
//Find the last (deepest) element that answers the critiria that it has "Joe" in it, and has a VBD in it
//in our case the VP
var closestToVbd = xml["*"].Last(x => x.Cq().Has(joe).Has("VBD").Any());
Console.WriteLine("Closest node to VPD:\n " +closestToVbd.OuterHTML);
//If we want the VBD itself we can just find the VBD in that element
Console.WriteLine("\n\n VBD itself is " + closestToVbd.Cq().Find("VBD")[0].OuterHTML);

Here is your second example

//Now for NP closest to 'Shopping', find the element with the text 'shopping' and find it's closest NP
var closest = xml["*"].First(x =>     x.InnerHTML.Equals("shopping")).Cq()
                      .Closest("NP")[0].OuterHTML;
Console.WriteLine("\n\n NP closest to shopping is: " + closest);
like image 37
Benjamin Gruenbaum Avatar answered Oct 05 '22 22:10

Benjamin Gruenbaum