I looking for a algorithm, function or technique that can take a string and convert it to a number. I would like the algorithm or function to have the following properties:
Similar strings would yield similar values (similar can be defined as similar in meaning or similar in composition)
Capable of handling strings of variable length
I read an article several years ago that gives me hope that this can be achieved. Unfortunately, I have been unable to recall the source of the article.
Similar in composition is pretty easy, I'll let somebody else tackle that.
Similar in meaning is a lot harder, but fun :), I remember reading an article about how a neural network was trained to construct a 2D "semantic meaning graph" of a whole bunch of english words, where the distance between two words represented how "similar" they are in meaning, just by training it on wikipedia articles.
You could do the same thing, but make it one-dimensional, that will give you a single continuous number, where similar words will be close to each other.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With