TLDR: If I want to save arrays of integers in a Postgres table, are there any pros or cons to using an array column (integer[]
) vs. using a JSON column (eg. does one perform better than the other)?
Backstory:
I'm using a PostgreSQL database, and Node/Knex to manage it. Knex doesn't have any way of directly defining a PostgreSQL integer[]
column type, so someone filed a Knex bug asking for it ... but one of the Knex devs closed the ticket, essentially saying that there was no need to support PostgreSQL array column types when anyone can instead use the JSON column type.
My question is, what downsides (if any) are there to using a JSON column type to hold a simple array of integers? Are there any benefits, such as improved performance, to using a true array column, or am I equally well off by just storing my arrays inside a JSON column?
EDIT: Just to be clear, all I'm looking for in an answer is either of the following:
A) an explanation of how JSON columns and integer[] columns in PostgreSQL work, including either how one is better than the other or how the two are (at least roughly) equal.
B) no explanation, but at least a reference to some benchmarks that show that one column type or the other performs better (or that the two are equal)
Don't use JSON for data that can easily be stored in database tables. Avoid large JSON objects if you want to modify individual attributes. Don't use JSON if you want to use attributes in complicated WHERE conditions. Avoid JSON if you want to join on some of the attributes.
In general, most applications should prefer to store JSON data as jsonb , unless there are quite specialized needs, such as legacy assumptions about ordering of object keys. RFC 7159 specifies that JSON strings should be encoded in UTF8.
Array Type. PostgreSQL gives the opportunity to define a column of a table as a variable length single or multidimensional array. Arrays of any built-in or user-defined base type, enum type, or composite type can be created. We will focus on one data type in particular, the Array of text, text[].
When you are considering portability (e.g. rewriting your system to work with other databses) then you must not use arrays. If you are sure you'll stick with Postgres, then you can safely use arrays where you find appropriate. They exist for a reason and are neither bad design nor non-compliant.
An int[]
is a lot more efficient in terms of storage it requires. Consider the following query which returns the size of an array with 500 elements
select pg_column_size(array_agg(i)) as array_size,
pg_column_size(jsonb_agg(i)) as jsonb_size,
pg_column_size(json_agg(i)) as json_size
from generate_series(1,500) i;
returns:
array_size | jsonb_size | json_size
-----------+------------+----------
2024 | 6008 | 2396
(I am quite surprised that the JSON value is so much smaller than the JSONB, but that's a different topic)
If you always use the array as a single value it does not really matter in terms of query performance But if you do need to look into the array and search for specific value(s), that will be a lot more efficient with a native array.
There are a lot more functions and operators available for native arrays than there are for JSON arrays. You can easily search for a single value in a JSON array, but searching for multiple values requires workarounds.
The following query demonstrates that:
with array_test (id, int_array, json_array) as (
values
(1, array[1,2,3], '[1,2,3]'::jsonb)
)
select id,
int_array @> array[1] as array_single,
json_array @> '1' json_single,
int_array @> array[1,2] as array_all,
json_array ?& array['1','2'] as json_all,
int_array && array[1,2] as array_any,
json_array ?| array['1','2'] as json_any
from array_test;
You can easily query an array if it contains one specific value. This also works for JSON arrays. Those are the expressions array_single
and json_single
. With a native array you could also use 1 = any(int_array)
instead.
But check if an array contains all values from a list, or any value from a list does not work with JSON arrays.
The above test query returns:
id | array_single | json_single | array_all | json_all | array_any | json_any
---+--------------+-------------+-----------+----------+-----------+---------
1 | true | true | true | false | true | false
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With