Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB Capped collection equivalent in PostgreSQL

The very basics of MongoDB capped collections is that they let you set maximum size of the table and the system will clear old data when the size limit is reached.

Has anyone came up with the similar setup in PostgreSQL and used it in production?

like image 323
Uzbekjon Avatar asked Jun 05 '12 11:06

Uzbekjon


2 Answers

Here is a simple solution to this, for instances where the size of the data is not too large and the performance requirements are not too tough.

First, let’s create our table. We are going to have an id column, and a data column, but you can have whatever columns you need for your particular situation.

CREATE TABLE capped_collection (
  id integer NOT NULL,
  data text);

Now, we create a sequence for our primary key, set the MAXVALUE to be the desired size of our capped collection, and we use the CYCLE option, so that the sequence will count up to MAXVALUE and then restart again at 1.

CREATE SEQUENCE capped_collection_id_seq
  START WITH 1
  INCREMENT BY 1
  MINVALUE 1
  MAXVALUE 10000
  CACHE 1
  CYCLE
  OWNED BY capped_collection.id;

In the next step (not shown here), you should pre-seed the table with MAXVALUE records. We do this so that every id value has a row, and we don’t have to worry about deciding whether to do an INSERT or UPDATE operation, just to keep things simple and performant.

Now that our table is set up, whenever we want to INSERT a new row, we instead do an UPDATE like so

UPDATE capped_collection
SET data = 'your data here'
WHERE id = (SELECT nextval('capped_collection_id_seq'))

The nextval command gets us the next id, and because we specified CYCLE it will wrap around back to 1 once it reaches MAXVALUE. The end result is you will only ever have MAXVALUE rows in the table, with the oldest entries being aged out.

(from http://www.rigelgroupllc.com/blog/2014/10/14/capped-collections-in-postgres/)

like image 190
JohnTheThird Avatar answered Oct 22 '22 20:10

JohnTheThird


Haven't heard of anything like that. I guess you could simulate a capped table size or circular queue by using:

  • upserts
  • and a modulo operator on a sequence (See Insert, on duplicate update in PostgreSQL? for an explanation on how to do the merge/upsert)

Especially interesting (from my perspective) is bovine's response (ie look at the non-accepted answers).

With the introduction of a column to the table to represent the index into the "collection" and a sequence for the column, each upsert can be based on the condition:

where index = (sequence.nextval % max collection size)
like image 35
Glenn Avatar answered Oct 22 '22 22:10

Glenn