I would like to get the meaning of a selected word using the Wiktionary API. The retrieved content should be the same as what is presented in "Word of the day": only the basic meaning, without etymology, synonyms, etc. For example:
"postiche n Any item of false hair worn on the head or face, such as a false beard or wig."
I tried to use the documentation but I can't find a similar example. Can anybody help with this problem?
The Wiktionary API can be used to query whether or not a word exists. The first link provides examples of other formats that might be easier to parse. These can then be parsed with any standard XML parser.
Wiktionary is not just for English. Wiktionary is a dictionary written in one language and covering all words in all languages, just as Wikipedia is an encyclopedia written in one language of all topics from all language-areas.
According to Wiktionary:Main Page, it aims to describe all words of all languages using definitions and descriptions in English. Wiktionary has grown beyond a standard dictionary and now includes a thesaurus, a rhyme guide, phrase books, language statistics and extensive appendices.
Of the 1,269,938 definitions the English Wiktionary provides for 996,450 English words, 478,068 are "form of" definitions of this kind.
Although MediaWiki has an API (api.php), it might be easiest for your purposes to just use the action=raw parameter to index.php if you just want to retrieve the source code of one revision (not wrapped in XML, JSON, etc., as opposed to the API).
For example, this is the raw word of the day page for November 14:
http://en.wiktionary.org/w/index.php?title=Wiktionary:Word_of_the_day/November_14&action=raw
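As a rough sketch of the action=raw approach in Python (standard library only): build_raw_url and fetch_raw are hypothetical helper names, and the https scheme and the User-Agent string are assumptions rather than anything the API requires in exactly this form.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: English Wiktionary over https (the links in this thread use http).
INDEX_URL = "https://en.wiktionary.org/w/index.php"


def build_raw_url(title):
    """Return the index.php URL that serves the raw wikitext of a page."""
    # urlencode percent-escapes ':' and '/' in the title; MediaWiki decodes them.
    return INDEX_URL + "?" + urlencode({"title": title, "action": "raw"})


def fetch_raw(title, timeout=10):
    """Download the raw wikitext of a page (requires network access)."""
    # Placeholder User-Agent; replace it with one that identifies your script.
    request = Request(build_raw_url(title),
                      headers={"User-Agent": "wikt-example/0.1"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling fetch_raw("Wiktionary:Word_of_the_day/November_14") should then return the same text as the link above, assuming the page still exists under that title.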
What's unfortunate is that the format of wiki pages focuses on presentation (for the human reader) rather than on semantics (for the machine), so you should not be surprised that there is no "get word definition" API command. Instead, your script will have to make sense of the numerous text formatting templates that Wiktionary editors have created and used, as well as complex presentational formatting syntax, including headings, unordered lists, and others. For example, here is the source code for the page "overflow":
http://en.wiktionary.org/w/index.php?title=overflow&action=raw
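To give an idea of what "making sense of the formatting" might look like, here is a deliberately naive Python sketch. It assumes that definitions are the lines starting with a single "#" inside the level-2 language section, which is a convention of English Wiktionary pages and not something the API guarantees. The function names and the sample wikitext are made up for illustration. It simply deletes templates, so definitions that consist only of a template (for instance the "form of" kind) come out empty.

```python
import re


def clean_markup(text):
    """Strip the most common wiki markup from one line of wikitext."""
    text = re.sub(r"\{\{[^{}]*\}\}", "", text)  # drop non-nested {{templates}}
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", text)  # [[a|b]] -> b
    text = re.sub(r"'{2,}", "", text)  # remove ''italic'' / '''bold''' quotes
    return " ".join(text.split())


def extract_definitions(wikitext, language="English"):
    """Collect top-level definition lines from one language section."""
    definitions = []
    in_language = False
    for line in wikitext.splitlines():
        heading = re.fullmatch(r"==([^=].*?)==\s*", line)  # level-2 heading only
        if heading:
            in_language = heading.group(1).strip() == language
            continue
        # "# text" is a definition; "#:" and "#*" are examples and quotations.
        if in_language and re.match(r"#[^:*#]", line):
            definitions.append(clean_markup(line[1:]))
    return definitions


# Hand-written sample in the style of a Wiktionary entry (not real page source).
SAMPLE = """==English==
===Noun===
{{en-noun}}
# Any item of [[false]] [[hair]] worn on the head or face, such as a false [[beard]] or [[wig]].
#: {{ux|en|An example sentence would go here.}}
==French==
===Noun===
# [[hairpiece]]
"""

print(extract_definitions(SAMPLE))
```

Real entries use nested templates, sub-senses ("##") and per-language quirks that this sketch ignores, so treat it as a starting point only.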
There is a "generate XML parse tree" option in the API, but it doesn't break much of the presentational formatting into XML. Just see for yourself:
http://en.wiktionary.org/w/api.php?action=query&titles=overflow&prop=revisions&rvprop=content&rvgeneratexml=&format=jsonfm
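If you would rather go through api.php, the same query can be requested with format=json (jsonfm is the HTML-rendered variant meant for reading in a browser). The Python sketch below leaves out rvgeneratexml and assumes the legacy JSON layout, in which the wikitext sits under the "*" key of the first revision; newer MediaWiki options such as formatversion=2 change that layout. The helper names are hypothetical. A page without a revisions entry is treated as missing, which also gives a simple way to check whether a word exists.

```python
import json
from urllib.parse import urlencode

# Assumption: English Wiktionary over https (the links in this thread use http).
API_URL = "https://en.wiktionary.org/w/api.php"


def build_query_url(title):
    """Return an api.php URL asking for the latest revision's wikitext as JSON."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "revisions",
        "rvprop": "content",
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def wikitext_from_response(response_text):
    """Return the page wikitext from a JSON response, or None if the page is missing."""
    pages = json.loads(response_text)["query"]["pages"]
    for page in pages.values():
        revisions = page.get("revisions")
        if revisions:
            # Legacy layout: the content is stored under the "*" key.
            return revisions[0]["*"]
    return None
```

The response body can be downloaded with any HTTP client and passed to wikitext_from_response as a string.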
In case you are wondering whether there exists a parser for MediaWiki-format pages other than MediaWiki, no, there isn't. At least not anything written in JavaScript that's currently maintained (see list of alternative parsers, and check the web sites of the two listed ones). And even then, supporting most/all of the common templates will be a big challenge. Good luck.