
PostgreSQL performance comparison between arrays and joins

What are the performance implications in PostgreSQL of using an array column to store values, compared to creating another table to store the values in a has-many relationship?

I have one table that needs to store anywhere from about 1 to 100 different string values per row, in either an array column or a separate table. These values will need to be frequently searched for exact matches, so lookup performance is critical. Would the array solution be faster, or would it be faster to use joins to look up the values in the separate table?

Mike asked May 22 '14 15:05


1 Answer

"These values will need to be frequently searched"

Searched how? This is crucial.

Prefix pattern match only? Infix/suffix pattern matches too? Fuzzy string search / similarity matching? Stemming and normalization for root words, de-pluralization? Synonym search? Is the data character sequences or natural language text? One language, or multiple different languages?

Hand-waving around "searched" makes any answer that ignores that part pretty much invalid.
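
To make the distinction concrete, here is a minimal sketch of three of those search kinds (exact, prefix, infix) against a separate values table. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; the table name `item_value`, its columns, and the sample strings are made up for illustration, and PostgreSQL's index behaviour for each predicate is not modelled here.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for PostgreSQL.
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item_value (item_id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO item_value VALUES (?, ?)",
    [(1, "postgres"), (1, "postgresql"), (2, "mypostgres"), (3, "mysql")],
)


def find(predicate, param):
    """Run one WHERE predicate against item_value and return sorted rows."""
    sql = (
        "SELECT item_id, value FROM item_value "
        f"WHERE {predicate} ORDER BY item_id, value"
    )
    return conn.execute(sql, (param,)).fetchall()


# Exact match: the case the question says it needs.
exact = find("value = ?", "postgres")

# Prefix match: anything that starts with the search term.
prefix = find("value LIKE ?", "postgres%")

# Infix match: the search term may appear anywhere in the value.
infix = find("value LIKE ?", "%postgres%")

print(exact)
print(prefix)
print(infix)
```

Each predicate returns a different set of rows for the same search term, which is why "searched" needs to be pinned down before comparing designs.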

"so lookup performance is critical. Would the array solution be faster, or would it be faster to use joins to look up the values in the separate table?"

Impossible to be strictly sure without proper info on the data you're searching.

Searching text fields is much more flexible, giving you many options you don't have with an array search. It also generally reduces the amount of data that must be read.
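
As a rough illustration of the separate-table design, the sketch below stores each value as its own row, indexes the value column, and finds parent rows by exact match through a join. It again uses Python's sqlite3 module as a stand-in for PostgreSQL; the names `item`, `item_value` and `idx_item_value_value` are assumptions for the example, and the query plan it inspects is SQLite's, not PostgreSQL's.

```python
import sqlite3

# SQLite in memory, standing in for PostgreSQL. All names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE item_value (
        item_id INTEGER NOT NULL REFERENCES item(id),
        value   TEXT NOT NULL
    );
    -- Ordinary index on the searched column.
    CREATE INDEX idx_item_value_value ON item_value (value);
""")
conn.executemany(
    "INSERT INTO item VALUES (?, ?)",
    [(1, "first"), (2, "second"), (3, "third")],
)
conn.executemany(
    "INSERT INTO item_value VALUES (?, ?)",
    [(1, "red"), (1, "green"), (2, "green"), (3, "blue")],
)

QUERY = """
    SELECT i.id, i.name
    FROM item AS i
    JOIN item_value AS v ON v.item_id = i.id
    WHERE v.value = ?
"""

# Parent rows that have the exact value 'green' among their values.
matches = sorted(conn.execute(QUERY, ("green",)).fetchall())

# Ask SQLite how it would run the query; the last column of each plan row
# is a human-readable description of one step.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("green",))]
uses_index = any("idx_item_value_value" in step for step in plan)

print(matches)
for step in plan:
    print(step)
```

With the array design, the PostgreSQL counterpart of the WHERE clause would be something like `'green' = ANY(tags)` on a `text[]` column (the column name `tags` is made up). Whether that reads more or less data than the join depends on how the array column is indexed and on the real data, so measure with your own tables before deciding.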

In general, I strongly second Clodaldo: Design it right. Optimize later, if you need to.

Craig Ringer answered Sep 21 '22 10:09