I think this is an interesting question, at least for me.
I have a list of words, let's say:
photo, free, search, image, css3, css, tutorials, webdesign, tutorial, google, china, censorship, politics, internet
and I have a list of contexts:
I need to try and match words with the appropriate context/contexts if possible.
Maybe discovering word relationships in some way.
Any ideas?
Help would be much appreciated!
This sounds like it's more of a categorization/ontology problem than NLP. Try WordNet for a standard ontology.
I don't see any real NLP in your stated problem, but if you do need some semantic analysis or a parser try NLTK.
Where do these words come from? Do they come from real texts? If so, this is a classic data mining problem. What you need to do is convert your set of documents into a matrix in which each row represents a document and each column represents a word that occurs in the documents.
For example if you have two documents like this:
D1: Need to find meaning.
D2: Need to separate Apples from oranges
your matrix will look like this:
      Need  to  find  meaning  Apples  Oranges  Separate  From
D1:   1     1   1     1        0       0        0         0
D2:   1     1   0     0        1       1        1         1
This is called a term-by-document matrix (also known as a term-document matrix).
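To make the construction concrete, here is a minimal sketch that builds the matrix for the two example documents using only the standard library. The function name build_term_document_matrix, the lowercasing, and the simple regex tokenizer are my own assumptions for illustration; a real pipeline would probably also drop stop words such as "to" and "from".

```python
import re

def build_term_document_matrix(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    # Assumption: lowercase everything and split on non-alphanumeric characters.
    tokenized = [re.findall(r"[a-z0-9]+", text.lower()) for text in documents]

    # Collect the vocabulary in order of first appearance.
    vocabulary = []
    seen = set()
    for tokens in tokenized:
        for token in tokens:
            if token not in seen:
                seen.add(token)
                vocabulary.append(token)

    # One row per document, one column per vocabulary term.
    column = {term: j for j, term in enumerate(vocabulary)}
    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocabulary)
        for token in tokens:
            row[column[token]] += 1
        matrix.append(row)
    return vocabulary, matrix

documents = ["Need to find meaning", "Need to separate Apples from oranges"]
vocabulary, matrix = build_term_document_matrix(documents)
print(vocabulary)
for name, row in zip(["D1", "D2"], matrix):
    print(name, row)
```

The columns come out in order of first appearance, so their order differs slightly from the table above, but each row carries the same counts.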
Having collected these statistics, you can use algorithms like K-Means to group similar documents together. Since you already know how many concepts you have (which gives you the number of clusters), your task should be somewhat easier. K-Means can be slow on a large, sparse matrix, so you can try to speed it up by first reducing the dimensionality with a technique such as SVD.
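As a rough illustration of the clustering step, here is a small pure-Python K-Means sketch run on a made-up matrix. Everything in it is hypothetical: the four rows, the column meanings (css, tutorial, webdesign, china, censorship, politics), and the choice to seed the centroids with the first k rows so the result is repeatable. The SVD step is not shown, and in practice you would more likely reach for a library implementation.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(rows, k, max_iterations=100):
    """Cluster rows into k groups; returns (assignments, centroids)."""
    # Assumption: seed centroids with the first k rows (deterministic, not ideal).
    centroids = [[float(value) for value in row] for row in rows[:k]]
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: each row goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        if new_assignments == assignments:
            break  # nothing changed, so the clustering has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [row for row, a in zip(rows, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids

# Hypothetical counts; columns: css, tutorial, webdesign, china, censorship, politics
rows = [
    [2, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 2, 1],
    [1, 2, 0, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
]
assignments, centroids = k_means(rows, k=2)
print(assignments)
```

Here k is set to the number of contexts you already know about. Documents that end up in the same cluster can then be mapped to a context by looking at which words dominate that cluster's centroid.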