I am building an Android app which takes a string input and returns a ranked list of books using the Google API.
I am looking for a way to compare the open-ended string that the user enters with the first item in the list, to see whether what they entered is 'likely' to be that one book. I have plenty of information about the book (title, author, description, etc.), so I can search in any part of it.
An example is:
'eyre affair fforde', 'fforde eyre affair', 'the eyre affair' ----> 'Likely' to be 'The Eyre Affair by Jasper Fforde'
What would be the best way to go about this? I have looked at Levenshtein distance, but I don't think it would work with such open-ended input. N-grams or fuzzy matching seem like a better way to go.
Any other ideas?
I would go with one of these:
SimMetrics (an open-source, extensible library of similarity and distance metrics, e.g. Levenshtein distance, L2 distance, cosine similarity, Jaccard similarity, etc.)
Commons Lang LevenshteinDistance
Or, to cope with mishearings or spelling mistakes: Soundex or Metaphone.
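For word-order-independent input like the examples in the question, a token-overlap score may be enough as a first pass. The following is a minimal sketch in plain Java, not taken from SimMetrics or Commons Lang: the class name TokenOverlap, its method names, and the idea of matching against "title by author" as one string are all assumptions made for illustration. It computes Jaccard similarity on word sets, plus a "coverage" score (the fraction of query words found in the candidate), which tends to suit short queries better.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of any library mentioned above.
class TokenOverlap {

    // Lower-case the text and split it on anything that is not a letter or digit.
    public static Set<String> tokens(String text) {
        Set<String> result = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                result.add(t);
            }
        }
        return result;
    }

    // Fraction of distinct query words that also occur in the candidate (0.0 to 1.0).
    public static double coverage(String query, String candidate) {
        Set<String> q = tokens(query);
        if (q.isEmpty()) {
            return 0.0;
        }
        Set<String> c = tokens(candidate);
        int hits = 0;
        for (String t : q) {
            if (c.contains(t)) {
                hits++;
            }
        }
        return (double) hits / q.size();
    }

    // Jaccard similarity on word sets: |A intersect B| / |A union B|.
    public static double jaccard(String a, String b) {
        Set<String> ta = tokens(a);
        Set<String> tb = tokens(b);
        Set<String> union = new HashSet<>(ta);
        union.addAll(tb);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> common = new HashSet<>(ta);
        common.retainAll(tb);
        return (double) common.size() / union.size();
    }

    public static void main(String[] args) {
        // Assumed candidate string: title and author of the top-ranked result.
        String book = "The Eyre Affair by Jasper Fforde";
        String[] queries = {"eyre affair fforde", "fforde eyre affair", "the eyre affair"};
        for (String q : queries) {
            System.out.printf("%s -> coverage %.2f, jaccard %.2f%n",
                    q, coverage(q, book), jaccard(q, book));
        }
    }
}
```

All three example queries get a coverage of 1.0 against that candidate, so a cut-off somewhere below that could serve as the 'likely' test; the exact threshold would need tuning on real input. This sketch only matches whole words exactly, so misspellings would still need a per-word edit distance or a phonetic code on top.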