Android & fuzzy matching, n-gram, and Levenshtein distance

Question

I am building an Android app which takes a string input and returns a ranked list of books using the Google API.

I am looking for a way to compare the open-ended string that the user enters with the first item in the list, to see whether what they entered is 'likely' to refer to one particular book. I have plenty of information about each book (title, author, description, etc.), so I can search in any of those fields.

An example is:

'eyre affair fforde', 'fforde eyre affair', 'the eyre affair'
----> 
'Likely' to be 'The Eyre Affair by Jasper Fforde'

What would be the best way to go about this? I have looked at Levenshtein distance, but I don't think it would work with such open-ended input. N-grams or fuzzy matching seem like a good way to go.

Any other ideas?

Chris · Accepted Answer

I would go with one of these:

SimMetrics, an open-source, extensible library of similarity and distance metrics (e.g. Levenshtein distance, L2 distance, cosine similarity, Jaccard similarity).

Apache Commons Lang's Levenshtein distance (StringUtils.getLevenshteinDistance).

Or, to tolerate mishearings and spelling mistakes, a phonetic algorithm such as Soundex or Metaphone.
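To illustrate how Levenshtein distance can still be useful with open-ended input, here is a minimal sketch in plain Java that applies it per word rather than to the whole string, so word order and missing words matter less. It is not taken from any of the libraries above: the class name FuzzyBookMatch, its methods, and the idea of joining title and author into one candidate string are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of SimMetrics or Commons Lang.
final class FuzzyBookMatch {

    // Classic dynamic-programming Levenshtein distance
    // (insertions, deletions and substitutions each cost 1).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Lowercase the text and split it on anything that is not a letter or digit.
    static List<String> tokens(String s) {
        List<String> out = new ArrayList<>();
        for (String t : s.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Similarity of two words in [0, 1]; 1.0 means the words are identical.
    static double wordSimilarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) levenshtein(a, b) / max;
    }

    // Average, over the query words, of the best similarity to any candidate word.
    // Word order is ignored, and extra words in the candidate are not penalised.
    static double score(String query, String candidate) {
        List<String> q = tokens(query);
        List<String> c = tokens(candidate);
        if (q.isEmpty() || c.isEmpty()) {
            return 0.0;
        }
        double total = 0.0;
        for (String qw : q) {
            double best = 0.0;
            for (String cw : c) {
                best = Math.max(best, wordSimilarity(qw, cw));
            }
            total += best;
        }
        return total / q.size();
    }
}
```

With this sketch, FuzzyBookMatch.score("fforde eyre affair", "The Eyre Affair by Jasper Fforde") gives 1.0, and a misspelt query such as "eyre afair forde" still scores close to 0.9. The cut-off above which the first result counts as 'likely' would have to be tuned on real queries; any particular value is a guess.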

Tags: java, android, levenshtein-distance, fuzzy-search, n-gram

Carrie Hall
