How to embed Wiktionary for offline access in Android App?

I am currently developing an Android dictionary app that fetches meanings online through the Wiktionary API with this request: http://en.wiktionary.org/w/api.php?action=query&prop=revisions&titles=overflow&rvprop=content&format=jsonfm

However, I want to download the Wiktionary database for offline use and embed it inside my Android app.

Here are the Wiktionary database download pages:
1. Wiktionary
2. Wikimedia Downloads

From my research, the Wiktionary offline database is available as XML and SQL dumps. But these files are too big, and embedding them would make the APK huge.
So is there a way to embed this data easily in my app?

zackygaurav asked Feb 04 '16 13:02


2 Answers

The developer [of English Dictionary - Offline] claims that they are using Wiktionary. I am still wondering where they got a Wiktionary dump file >22 MB.

I'm not being paid enough to tell you that... (joke). The thing is, you need to extract the dictionary entries from the XML files; once you keep only those, the final content (text) file becomes smaller.
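The extraction step can be sketched with Java's streaming StAX parser, run as a desktop preprocessing step rather than inside the app. This is a minimal sketch, assuming the standard MediaWiki export layout in which each <page> holds a <title> and a <revision> with a <text> element. The class name DumpExtractor is hypothetical, and a real dump would need each entry written out as it is read instead of being collected in memory.

```java
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of any Wiktionary tooling.
final class DumpExtractor {

    // Streams a MediaWiki XML export and returns page title -> raw wikitext.
    // Assumption: a Map is fine for a small sample; for a full dump, write
    // each entry to the output file when its </page> is reached instead.
    static Map<String, String> extract(Reader xml) {
        Map<String, String> entries = new LinkedHashMap<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // No DTD is needed to read the dump, so DTD support is switched off.
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
            XMLStreamReader reader = factory.createXMLStreamReader(xml);

            String title = null;
            String text = null;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("title")) {
                        title = reader.getElementText();
                    } else if (name.equals("text")) {
                        text = reader.getElementText();
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals("page")) {
                    // End of one page: keep it only if both parts were seen.
                    if (title != null && text != null) {
                        entries.put(title, text);
                    }
                    title = null;
                    text = null;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse dump", e);
        }
        return entries;
    }
}
```

Calling DumpExtractor.extract(new FileReader(...)) on a decompressed dump would yield the raw wikitext per title; turning that wikitext into plain definitions is a separate step not shown here.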

Alternatively...

You can try this TSV file (courtesy of semisignal.com), which is a snapshot of definitions from November 2012. It contains most of the words an end user looking up English would need. The TSV is 54 MB and can be handled like a plain text file.

Try a definition, for example brushable. The TSV has the lines below (compare them to Wiktionary's entry for brushable):

English brushable Adjective # Able to be [[brushed]]
English brushable Adjective # Able to be controlled by [[brushing]]
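Reading one such line in Java could look like the sketch below. It assumes the four columns (language, word, part of speech, definition) are separated by tab characters, as the TSV format implies, and that the definition starts with a "#" marker as in the sample lines. The class name TsvEntry is hypothetical.

```java
import java.util.Optional;

// Hypothetical value class for one TSV row; the column order is an assumption
// based on the sample lines: language, word, part of speech, definition.
final class TsvEntry {
    final String language;
    final String word;
    final String partOfSpeech;
    final String definition;

    private TsvEntry(String language, String word, String partOfSpeech, String definition) {
        this.language = language;
        this.word = word;
        this.partOfSpeech = partOfSpeech;
        this.definition = definition;
    }

    // Returns an empty Optional for lines that do not have four columns.
    static Optional<TsvEntry> parse(String line) {
        // Limit 4 keeps any tab inside the definition in the last column.
        String[] cols = line.split("\t", 4);
        if (cols.length < 4) {
            return Optional.empty();
        }
        String definition = cols[3].trim();
        // Drop the leading "#" list marker if present.
        if (definition.startsWith("#")) {
            definition = definition.substring(1).trim();
        }
        return Optional.of(new TsvEntry(cols[0], cols[1], cols[2], definition));
    }
}
```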


TIPS: To reduce the file size, you can trim off the leading "English", since you already know they are all English definitions. Each trim saves 7 bytes (multiply by the total number of definitions).

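  • Use String.replace on "English " (with the trailing space) to clear it. Since replace removes every occurrence, strip it only from the start of each line if definitions may contain that word.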

  • Also replace "Adjective", "Verb" and "Noun" with short codes that your app can map back to the entry type when showing it in the user interface. For example, code 1 could mean the entry is listed as an Adjective.

Your trimmed text file could look like the example below. Each double full stop just means "next section of entry", so the layout is basically entry..type..definition, where <xyz> is a link to another entry in the dictionary. The 54-byte TSV entry now becomes 35 bytes for that one line.

brushable..1..Able to be <brushed>.
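The trimming tips can be put together in one small conversion routine. This is a sketch under assumptions: the input is tab-separated, the codes 2 and 3 for Noun and Verb are made up here (only 1 for Adjective comes from the example), and the class name EntryCompactor is hypothetical. It drops the language column by position instead of using String.replace, and it does not append the trailing full stop shown in the example line.

```java
// Hypothetical converter from one TSV line to the compact
// word..type..definition layout described above.
final class EntryCompactor {

    // Assumed code table: only 1 = Adjective is taken from the example;
    // the other values are placeholders for illustration.
    static int typeCode(String partOfSpeech) {
        switch (partOfSpeech) {
            case "Adjective": return 1;
            case "Noun": return 2;
            case "Verb": return 3;
            default: return 0; // unknown or unmapped entry type
        }
    }

    static String compact(String tsvLine) {
        String[] cols = tsvLine.split("\t", 4);
        if (cols.length < 4) {
            throw new IllegalArgumentException("Expected 4 columns: " + tsvLine);
        }
        String definition = cols[3].trim();
        // Remove the leading "#" list marker.
        if (definition.startsWith("#")) {
            definition = definition.substring(1).trim();
        }
        // Turn simple wiki links [[brushed]] into <brushed>.
        // Piped links such as [[a|b]] are left untouched by this pattern.
        definition = definition.replaceAll("\\[\\[([^\\]|]+)\\]\\]", "<$1>");
        // cols[0] (the language) is dropped entirely.
        return cols[1] + ".." + typeCode(cols[2]) + ".." + definition;
    }
}
```

One design caveat: a definition that itself contains ".." would be ambiguous in this layout, so a real converter might need to escape it or pick a different separator.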

Save the final edited (reduced) text file. Embed that into your APK.
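Looking a word up in the embedded file could then be as simple as the sketch below, which scans the compact lines in order. The class name CompactDictionary is hypothetical. On Android, the Reader would typically wrap a stream from the app's assets, for example context.getAssets().open("dictionary.txt") with an assumed file name; a linear scan over a file of this size may be slow, so an index or a database is worth considering.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical lookup over lines in the word..type..definition layout.
final class CompactDictionary {

    // Returns every definition stored for the given word, in file order.
    static List<String> lookup(Reader source, String word) {
        String prefix = word + "..";
        List<String> definitions = new ArrayList<>();
        BufferedReader reader = new BufferedReader(source);
        // lines() reports read failures as UncheckedIOException.
        reader.lines().forEach(line -> {
            if (line.startsWith(prefix)) {
                // Limit 3 keeps any later ".." inside the definition part.
                String[] parts = line.split("\\.\\.", 3);
                if (parts.length == 3) {
                    definitions.add(parts[2]);
                }
            }
        });
        return definitions;
    }
}
```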

VC.One answered Nov 06 '22 03:11


I suggest implementing online API access first, so that a small app can be downloaded and used, and adding a button somewhere that downloads the offline part. Also check the network connection and, if it is not Wi-Fi, warn the user so that the mobile data plan is not used up by downloading a 100 MB dictionary.
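The decision behind that download button can be kept separate from the Android APIs, as in the sketch below. The class DownloadPolicy and its names are hypothetical; the two flags are assumed to come from the platform's connectivity API (for instance, ConnectivityManager.isActiveNetworkMetered() for the metered flag).

```java
// Hypothetical policy for the "download offline dictionary" button.
// The caller is assumed to obtain both flags from Android's connectivity APIs.
final class DownloadPolicy {

    enum Decision { START_DOWNLOAD, WARN_METERED, NO_CONNECTION }

    static Decision decide(boolean connected, boolean metered) {
        if (!connected) {
            return Decision.NO_CONNECTION;
        }
        // On a metered (typically mobile) connection, ask before downloading.
        return metered ? Decision.WARN_METERED : Decision.START_DOWNLOAD;
    }

    // Builds the warning shown before a large download on mobile data.
    static String warningText(long sizeInBytes) {
        long megabytes = Math.round(sizeInBytes / (1024.0 * 1024.0));
        return "This will download about " + megabytes
                + " MB over mobile data. Continue?";
    }
}
```

Keeping the rule in plain Java like this makes it easy to unit test without a device; the Activity only has to show the dialog when the result is WARN_METERED.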

Ivan Marinov answered Nov 06 '22 03:11