
Calling wordnet from php (Wordnet class or API for PHP)

I am trying to write a program that measures the similarity between two documents. Since I'm only working with English text, I decided to use WordNet, but I can't find a way to call WordNet from PHP, and I can't find any WordNet API for PHP.

I saw on the forum that someone (Spudley) said he called WordNet from PHP using the shell_exec() function, in the question "Thesaurus class or API for PHP [edited]".

I would really like to see the method used, some example code, or perhaps a tutorial for getting started with WordNet in PHP.

Many thanks.

prabhath014 asked Dec 27 '22 19:12



1 Answer

The PHP extension linked from the WordNet site is very old and out of date. It claims to work with PHP4, so I don't think anyone has looked at it in years.

There aren't any other WordNet APIs available for PHP, so I rolled my own solution.

WordNet can be run from the command line, so PHP's shell_exec() function can run it and capture the output.

If you run WordNet from the command line without any parameters (cd to WordNet's directory, then just run wn), it shows a list of the search options that WordNet supports.

Still on the command line, if you try some of those options, you'll see how WordNet formats its results. For example, if you want synonyms for the word 'star', you could try the -synsn option:

wn star -synsn

This will produce output that looks a bit like this:

Synonyms/Hypernyms (Ordered by Estimated Frequency) of noun star

8 senses of star

Sense 1
star
       => celestial body, heavenly body

Sense 2
ace, adept, champion, sensation, maven, mavin, virtuoso, genius, hotshot, star, superstar, whiz, whizz, wizard, wiz
       => expert

Sense 3
star
       => celestial body, heavenly body

Sense 4
star
       => plane figure, two-dimensional figure

Sense 5
star, principal, lead
       => actor, histrion, player, thespian, role player

Sense 6
headliner, star
       => performer, performing artist

Sense 7
asterisk, star
       => character, grapheme, graphic symbol

Sense 8
star topology, star
       => topology, network topology

In PHP, you can read this same output using the shell_exec() function.

// escapeshellarg() stops a user-supplied word from injecting shell commands
$result = shell_exec('/path/to/wn ' . escapeshellarg($word) . ' -synsn');

Now $result should contain the block of text quoted above.
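To make that call a little more robust, here is a minimal sketch of a wrapper, assuming wn is installed at a path you supply. The function names buildWnCommand() and queryWordnet() and their parameters are my own illustration, not part of WordNet or PHP. It escapes the arguments and treats empty output as "no result".

```php
<?php
// Hypothetical helper: builds the command line for the wn binary.
// $wnPath, $word and $option are illustrative names, not part of WordNet.
function buildWnCommand(string $wnPath, string $word, string $option = '-synsn'): string
{
    // Allow only simple "-name" options so nothing unexpected reaches the shell.
    if (!preg_match('/\A-[a-z0-9]+\z/i', $option)) {
        throw new InvalidArgumentException("Unexpected option: $option");
    }
    // escapeshellarg() quotes each value so it is passed as a single argument.
    return escapeshellarg($wnPath) . ' ' . escapeshellarg($word) . ' ' . $option;
}

// Hypothetical helper: runs wn and returns its raw text output,
// or null if the command produced nothing.
function queryWordnet(string $wnPath, string $word, string $option = '-synsn'): ?string
{
    $output = shell_exec(buildWnCommand($wnPath, $word, $option));
    // shell_exec() returns null or false when the command fails or prints nothing.
    return is_string($output) && trim($output) !== '' ? $output : null;
}
```

You would call it as queryWordnet('/path/to/wn', $word) and check for null before parsing. Keeping the command builder separate makes it easy to inspect the exact command line without running anything.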

At this point, you have to do some proper coding. You'll need to take that block of text and parse it for the data you want.

This is where it gets tricky. Because the output is formatted for a human reader rather than for a program, it is hard to parse accurately.
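As a starting point, here is a rough sketch of a parser for the -synsn output shown above. It assumes every sense starts with "Sense N" and that "=>" separates the synonyms from the hypernyms. The function names and array keys are hypothetical, and other search options would need their own handling.

```php
<?php
// Hypothetical parser for "wn <word> -synsn" output; the array keys are illustrative.
function parseSynsnOutput(string $text): array
{
    $senses = [];
    // Split the text at each "Sense N" marker, keeping the sense numbers.
    $parts = preg_split('/\bSense\s+(\d+)\b/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    // $parts[0] is the header; after that, entries alternate: number, body.
    for ($i = 1; $i + 1 < count($parts); $i += 2) {
        // Text before the first "=>" is the synonym list, the rest is hypernyms.
        $halves = explode('=>', $parts[$i + 1], 2);
        // A sense may list several "=>" lines, so treat each one as a separator.
        $hypernyms = str_replace('=>', ',', $halves[1] ?? '');
        $senses[] = [
            'sense'     => (int) $parts[$i],
            'synonyms'  => splitTerms($halves[0]),
            'hypernyms' => splitTerms($hypernyms),
        ];
    }
    return $senses;
}

// Splits a comma-separated list into trimmed, non-empty terms.
function splitTerms(string $list): array
{
    $terms = array_map('trim', explode(',', $list));
    return array_values(array_filter($terms, fn($t) => $t !== ''));
}
```

Because it splits on the "Sense N" markers rather than on line breaks, it does not care whether each sense is spread over several lines or sits on one line.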

Note that different search options format their output slightly differently, and some of the results returned can be somewhat esoteric. I ended up writing a weighting system to score the results, but it was fairly specific to my needs, so you'll need to experiment to come up with your own system.
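To give an idea of what a scoring pass could look like, here is a purely illustrative sketch. It is not the weighting system described above. It assumes the output has already been parsed into a list of senses, each with a 'sense' number and a 'synonyms' array (a hypothetical shape), and it relies on the senses being ordered by estimated frequency, as the output header says.

```php
<?php
// Illustrative scoring only; this is not the weighting system the answer describes.
// $senses is assumed to be a list of ['sense' => int, 'synonyms' => string[]].
function scoreSynonyms(array $senses, string $queryWord): array
{
    $scores = [];
    foreach ($senses as $entry) {
        // Earlier senses are more frequent, so they get a larger weight (1, 1/2, 1/3, ...).
        $weight = 1 / $entry['sense'];
        foreach ($entry['synonyms'] as $term) {
            if (strcasecmp($term, $queryWord) === 0) {
                continue; // skip the word that was looked up
            }
            // A term that appears under several senses accumulates weight.
            $scores[$term] = ($scores[$term] ?? 0) + $weight;
        }
    }
    arsort($scores); // highest score first, keys preserved
    return $scores;
}
```

The 1/N weighting is an arbitrary choice for the sketch. For document similarity you would probably want to tune it, or drop the rarer senses altogether.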

I hope that's enough help for you. :)

Spudley answered Dec 31 '22 14:12
