I have some vectors, produced by embedding faces, that I would like to store in a database. Given the embedding of a reference face, I need to be able to find similar vectors in the database.
I have tried using an array type in PostgreSQL, but there isn't any support for subtraction.
The specific problem is this: suppose I have some vector data in a table:
{1, 2, 3},
{4, 5, 6},
{7, 8, 9}
I want to figure out which one of these three vectors is closest (in Euclidean distance) to vector {5, 5, 5}.
The required operations are to subtract two vectors and then to find the length of the difference, e.g. ||{5, 5, 5} - {4, 5, 6}||_2.
In my scenario, a vector will have 128 dimensions.
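For reference, the computation described above can be sketched in plain Python, outside the database. This is only a brute-force illustration on the three sample vectors; the function names `l2_distance` and `nearest` are made up for this example and are not part of any library.

```python
import math


def l2_distance(a, b):
    """Euclidean length of the element-wise difference a - b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, vectors):
    """Return (vector, distance) for the stored vector closest to query."""
    best = min(vectors, key=lambda v: l2_distance(query, v))
    return best, l2_distance(query, best)


# The three sample rows from the question.
rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

closest, dist = nearest([5, 5, 5], rows)
print(closest, round(dist, 3))  # [4, 5, 6] 1.414
```

The same code works unchanged for 128-dimensional vectors, but it scans every row, which is the part a database-side solution is meant to take over.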
It seems that you want PostGIS, an extension of PostgreSQL that adds a whole bunch of geometric data types (point, vector, arc, etc.).
Since you want to search for vectors for embeddings and are asking for Euclidean distance, the proper PostgreSQL add-on for your use case is pgvector.
It supports the following distance functions, among others: L2 (Euclidean) distance, inner product, and cosine distance.
L2 distance is usually used for face recognition.
Cosine distance is suggested by OpenAI for their embeddings. L2 distance, however, would yield the same ranking there, because those embeddings are normalized to unit length.
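The equivalence between cosine and L2 distance holds when the embeddings are normalized to unit length, because then ||a - b||^2 = 2 * (1 - cos(a, b)). The following is a small sketch that checks this on the sample vectors from the question after normalizing them; the helper names are illustrative only, and for vectors that are not normalized the two rankings can differ.

```python
import math


def normalize(v):
    """Scale v to unit Euclidean length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)


# Sample vectors from the question, normalized to unit length.
stored = [normalize(v) for v in ([1, 2, 3], [4, 5, 6], [7, 8, 9])]
query = normalize([5, 5, 5])

# Row indices ordered from nearest to farthest under each distance.
by_l2 = sorted(range(len(stored)), key=lambda i: l2_distance(query, stored[i]))
by_cos = sorted(range(len(stored)), key=lambda i: cosine_distance(query, stored[i]))

print(by_l2 == by_cos)  # True
```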
You can find installation instructions and references to libraries for most programming languages in the link above.
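As a rough idea of what this could look like from Python, here is a minimal sketch that only builds the SQL and its parameters. The table name `faces`, the column name `embedding`, and the helpers `to_vector_literal` and `nearest_query` are hypothetical names chosen for this example; `vector(128)` and the `<->` (L2 distance) operator come from pgvector. The `%s` placeholders assume a driver with psycopg-style parameters.

```python
# One-time setup (hypothetical table and column names).
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS faces (
    id bigserial PRIMARY KEY,
    embedding vector(128)
);
"""

# Nearest neighbours by L2 distance, using pgvector's <-> operator.
NEAREST_SQL = "SELECT id FROM faces ORDER BY embedding <-> %s::vector LIMIT %s;"


def to_vector_literal(values):
    """Format numbers in pgvector's text form, e.g. [1.0,2.0,3.0]."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def nearest_query(embedding, k=1):
    """Return (sql, params) for the k stored faces closest to embedding."""
    return NEAREST_SQL, (to_vector_literal(embedding), k)


# Placeholder 128-dimensional query embedding, for illustration only.
sql, params = nearest_query([0.5] * 128, k=5)
print(sql)
print(params[0][:20], "...")
```

With a connection from a PostgreSQL driver, the pair would typically be passed as `cursor.execute(sql, params)`; that step is left out here because it needs a running database with pgvector installed.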
If you are interested in OpenAI embeddings (and Bing brought you here):