Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Question on multi-probe Local Sensitive Hashing

sorry to be asking this kind noob question, but because I really need some guidance on how to use Multi probe LSH pretty urgently, so I did not do much research myself. I realize there is a lib call LSHKIT available that implemented that algorithm, but I have trouble trying to figure out how to use it. Right now, I have a few thousand feature vector 296 dimension, each representing an image. The vector is used to query an user input image, to retrieve the most similar image. The method I used to derive the distance between vector is euclidean distance.

I know this might be a rather noob question, but do you guys have knowledge on how should i implement multi probe LSH? I am really very grateful to any answer or response.

-- update --

Tried to create a model for my data with the provided tool fitdata, however it doesn't seem to take in my file. The format I used for the input is in this format,float size : 4, number of data : 20, dimension : 297, and my array of 297 dimenison float array. However it give me this error

gsl: init_source.c:29: ERROR: matrix dimension n1 must be positive integer
Default GSL error handler invoked.
Aborted

Do you guys have any idea how to create a input for fitdata?

-- update --

Sorry for the late update, after trying out lsh. You can use the text2bin to format the data for fitdata. The text file contain the feature vector of the image or audio file, with each row representating an vector. After which, use mplsh-tune to get the M and W parameter. To construct the index, you can use the scan tool to sample a set of required query and you can use mplsh-run to get the index. Right now i trying to figure out how to use the index and how to link the library into my coding. Any body have any idea on this?

like image 992
Yijinsei Avatar asked Apr 04 '10 15:04

Yijinsei


People also ask

What are the advantages of locally sensitive hashing?

Locality Sensitive Hashing (LSH) is one of the most popular techniques for finding approximate nearest neighbor searches in high-dimensional spaces. The main benefits of LSH are its sub-linear query performance and theoretical guarantees on the query accuracy.

Is MinHash locality sensitive hashing?

The MinHash scheme may be seen as an instance of locality sensitive hashing, a collection of techniques for using hash functions to map large sets of objects down to smaller hash values in such a way that, when two objects have a small distance from each other, their hash values are likely to be the same.

What is meant by locality sensitive hashing?

In computer science, locality-sensitive hashing (LSH) is an algorithmic technique that hashes similar input items into the same "buckets" with high probability. (The number of buckets is much smaller than the universe of possible input items.)


1 Answers

Let me instead point you to spectral hashing which kicks LSH's butt big time. Bonus: They have matlab code on their website, which you can either use or verify your own implementation against. Also, it's much easier to implement.

like image 172
bayer Avatar answered Oct 05 '22 22:10

bayer