
Using node.js and natural language processing to handle multiple word phrases

I'm using the very cool natural library for node.js.

I'm trying to train my classifier to match the phrase "user experience". My issue is that if I do something like this:

classifier.addDocument(['user experience'], 'ux');

It doesn't match two-word phrases, I believe because it tokenizes the words. If I do something like this:

classifier.addDocument(['user', 'experience'], 'ux');

It works the way I want, but I don't want to match on the word "user" alone: an article could include the word "user" many times and have nothing to do with user experience, which would lead to inaccurate classifications. So my question is: how does one match phrases of two or more words using NLP?

Thanks for your help in advance.

imns asked Apr 19 '14 16:04

1 Answer

You should take a look at n-grams. The specific case here is a bigram, a sequence of two tokens: https://github.com/NaturalNode/natural#bigrams
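To make the idea concrete, here is a minimal sketch of what a bigram is, written in plain JavaScript with no dependencies. The function name `bigrams` and the sample sentence are just for illustration; the section of the natural README linked above documents the library's own n-gram helpers, which would be the thing to use in practice.

```javascript
// Return every pair of adjacent tokens (the bigrams) in a token array.
function bigrams(tokens) {
  const pairs = [];
  for (let i = 0; i + 1 < tokens.length; i++) {
    pairs.push([tokens[i], tokens[i + 1]]);
  }
  return pairs;
}

// Naive whitespace tokenization, only for the sake of the example.
const sampleTokens = 'good user experience matters'.split(' ');
console.log(bigrams(sampleTokens));
// [ [ 'good', 'user' ], [ 'user', 'experience' ], [ 'experience', 'matters' ] ]
```

The pair ['user', 'experience'] shows up as a unit of its own, which is what lets a phrase be told apart from a lone "user".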

I haven't used that particular library. I don't think Node.js is the best choice for NLP, since it's still at an early stage there, and I'd suggest a more mature library (NLTK) and language (Python) for NLP. That said, I guess it's fine for testing or a small project.

Anyway, judging from the manual, you could maybe do something like:

classifier.addDocument([['user', 'experience']], 'ux');

Add an inner pair of brackets around each sequence of words you want to group together.
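If the nested-array form turns out not to behave as hoped (I haven't verified it), another option is to merge each known phrase into a single token yourself before the text reaches the classifier. The sketch below is plain JavaScript; `mergePhrases`, the underscore separator, and the sample data are all hypothetical names and choices of mine, not part of natural.

```javascript
// Hypothetical helper: replace each occurrence of a known multi-word phrase
// with one joined token, so that 'user' on its own never counts as the phrase.
function mergePhrases(tokens, phrases) {
  const out = [];
  let i = 0;
  while (i < tokens.length) {
    // Find the first non-empty phrase that matches starting at position i.
    const match = phrases.find(
      (p) => p.length > 0 && p.every((word, k) => tokens[i + k] === word)
    );
    if (match) {
      out.push(match.join('_'));
      i += match.length;
    } else {
      out.push(tokens[i]);
      i += 1;
    }
  }
  return out;
}

const knownPhrases = [['user', 'experience']];
const articleTokens = ['the', 'user', 'liked', 'the', 'user', 'experience'];
console.log(mergePhrases(articleTokens, knownPhrases));
// [ 'the', 'user', 'liked', 'the', 'user_experience' ]
```

The merged array could then be handed to classifier.addDocument in place of the raw tokens, assuming the classifier uses a token array as given. The same merging step would have to be applied to any text before classifying it, otherwise the 'user_experience' token would never appear at classification time.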

Samir Alajmovic answered Oct 24 '22 02:10