Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find the last item in a JSON array in Postgres 9.4

We had a legacy system that attempted to keep track of all versions of data saved of a particular document. We originally stored our JSON as a string in some old version of Postgres, but recently we upgraded to Postgres 9.3 and we started using the JSON column type.

We had a column called "versions", and it held an an array, and each saved version of a particular document was stored in the array, so a query like this:

SELECT _data_as_json FROM measurements WHERE id = 3307551

returned JSON like this:

 {"reports": {}, "versions": [
 {"timestamp": "2014-04-28T19:12:31.567415", "user": 11327, "legacy": {}, "vd_version": 1}, 
 {"timestamp": "2014-05-12T18:03:24.417029", "user": 11331, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-05-12T21:52:50.045758", "user": 10373, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-05-14T23:34:37.797822", "user": 10380, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-07-16T14:56:38.667363", "user": 10374, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-07-16T14:57:47.341541", "user": 10374, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-07-17T16:32:09.067026", "user": 11331, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-09-11T14:35:44.436886", "user": 11331, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-10-15T14:30:50.554932", "user": 10383, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-10-29T15:36:35.183787", "user": 11331, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}, 
 {"timestamp": "2014-11-12T22:22:03.892484", "user": 10373, "legacy": {"lengthmoment": {"moment": {"size": 130}, "length": {"in": 64.0}}, "comments": "", "custom": null}, "vd_version": 1}
 ]}     

We (attempted to) stored the data in "versions" in chronological order, but 99% of the time, we only need the last document. In Postgres 9.3, we came up with this query to get the last item:

SELECT json_array_elements(_data_as_json->'versions')
FROM measurements
WHERE id = 3307551
LIMIT 1 OFFSET (SELECT json_array_length(_data_as_json->'versions') - 1 FROM measurements WHERE id = 3307551)

This basically works, but it is a little fragile. If we ever fail to order things correctly in the versions array, then we get back the wrong version of the document. I'm curious if there is a better way to do this? I have read that Postgres 9.4 offers more functions for dealing with JSON.

Ideally, we could do ORDER BY on the "timestamp". Is that possible?

like image 765
cerhovice Avatar asked Dec 18 '14 17:12

cerhovice


1 Answers

Postgres 9.5+

The job is simple now since, quoting the manual:

The field/element/path extraction operators that accept integer JSON array subscripts all support negative subscripting from the end of arrays.

Bold emphasis mine. So for either json or jsonb:

SELECT data->'versions'->>-1
FROM   measurements m
WHERE  id = 3307551;

Postgres 9.4

You may want to use jsonb instead of json. Use jsonb_array_elements() or jsonb_array_length() accordingly.

There is a general approach to get the last element according to original sort order using WITH ORDINALITY (slightly slower without):

SELECT v.ver
FROM   measurements m
     , jsonb_array_elements(m.data->'versions') WITH ORDINALITY v(ver, ord)
WHERE  m.id = 3307551
ORDER  BY v.ord DESC
LIMIT  1;

Details for WITH ORDINALITY (and the implicit JOIN LATERAL in both versions):

  • PostgreSQL unnest() with element number

Postgres 9.3

"last" according to the timestamp value:

SELECT v.ver
FROM   measurements m
     , json_array_elements(m.data->'versions') v(ver)
WHERE  m.id = 3307551
ORDER  BY  (v.ver->>'timestamp')::timestamp DESC
LIMIT  1;

"last" according to ordinal position in the json array (faster):

SELECT data->'versions'->(json_array_length(data->'versions') - 1)
FROM   measurements
WHERE  id = 3307551;

We need - 1 because JSON arrays start at offset 0.

db<>fiddle here
Old sqlfiddle

like image 59
Erwin Brandstetter Avatar answered Oct 16 '22 16:10

Erwin Brandstetter