I'm working on a Natural Language Processing (NLP) project in which I use a syntactic parser to create a syntactic parse tree out of a given sentence.
Example Input: I ran into Joe and Jill and then we went shopping
Example Output: [TOP [S [S [NP [PRP I]] [VP [VBD ran] [PP [IN into] [NP [NNP Joe] [CC and] [NNP Jill]]]]] [CC and] [S [ADVP [RB then]] [NP [PRP we]] [VP [VBD went] [NP [NN shopping]]]]]]
I'm looking for a C# utility that will let me do complex queries like:
Here's a Java utility that does this, I'm looking for a C# equivalent.
Any help would be much appreciated.
Senior Member. You read it as "ei-cee" (no "slash" pronounced). In terms of distinguishing between "air conditioning" and "air conditioner," I can think of an example like "Today, I bought a new air conditioner" ("conditioning" not allowed). I personally would not say "Today, I bought a new AC."
What A/C Means. The term “A/C” stands for “air conditioning,” but it's frequently used to describe any type of home cooling equipment, such as a traditional split-system air conditioner or heat pump, mini-split unit, geothermal system, or even a window unit.
He combined moisture with ventilation to "condition" and change the air in the factories, controlling the humidity so necessary in textile plants. Willis Carrier adopted the term and incorporated it into the name of his company. Domestic air conditioning soon took off.
In the air conditioning industry, the term HVAC is often used instead of AC. HVAC refers to heating, ventilation, and air conditioning, whereas AC simply refers to air conditioning. AC is generally used when referring to systems that are designed to cool the air in your home.
There are at least two NLP frameworks, i.e.
And here you can find instructions to use a java NLP in .NET:
This page is about using java OpenNLP, but could apply to the java library you've mentioned in your post
Or use NLTK following this guidelines:
We already use
One option would be to parse the output into C# code and then encoding it to XML making every node into string.Format("<{0}>", this.Name);
and string.Format("</{0}>", this._name);
in the middle put all the child nodes recursively.
After you do this, I would use a tool for querying XML/HTML to parse the tree. Thousands of people already use query selectors and jQuery to parse tree-like structure based on the relation between nodes. I think this is far superior to TRegex or other outdated and un-maintained java utilities.
For example, this is to answer your first example:
var xml = CQ.Create(d.ToXml());
//this can be simpler with CSS selectors but I chose Linq since you'll probably find it easier
//Find joe, in our case the node that has the text 'Joe'
var joe = xml["*"].First(x => x.InnerHTML.Equals("Joe"));
//Find the last (deepest) element that answers the critiria that it has "Joe" in it, and has a VBD in it
//in our case the VP
var closestToVbd = xml["*"].Last(x => x.Cq().Has(joe).Has("VBD").Any());
Console.WriteLine("Closest node to VPD:\n " +closestToVbd.OuterHTML);
//If we want the VBD itself we can just find the VBD in that element
Console.WriteLine("\n\n VBD itself is " + closestToVbd.Cq().Find("VBD")[0].OuterHTML);
Here is your second example
//Now for NP closest to 'Shopping', find the element with the text 'shopping' and find it's closest NP
var closest = xml["*"].First(x => x.InnerHTML.Equals("shopping")).Cq()
.Closest("NP")[0].OuterHTML;
Console.WriteLine("\n\n NP closest to shopping is: " + closest);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With