I have a materialized view on a PostgreSQL 9.3 database which seldom changes (about twice a day). But when it does, I'd like to update its data promptly.
Here is what I was thinking about so far:
There is a materialized view mat_view
which gets its data from tables table1
and table2
using some join statement.
Whenever something in table1
or table2
changes, I already have a trigger which updates a little configuration table config
consisting of
table_name | mat_view_name | need_update -----------+---------------+------------ table1 | mat_view | TRUE/FALSE table2 | mat_view | TRUE/FALSE
So if anything in table1
changes (there's a trigger on UPDATE and on DELETE for every statement), the field need_update
in the first row is set to TRUE
. The same goes for table2
and the second row.
Obviously, if need_update
is TRUE, then the materialized view must be refreshed.
UPDATE: Since materialized views do not support rules (as @pozs mentioned in a comment below), I would go one step further. I'd create a dummy view v_mat_view
with the definition "SELECT * FROM mat_view
". When the user does a SELECT on this view, I need to create a rule ON SELECT which does the following:
mat_view
should be updated (SELECT 1 FROM config WHERE mat_view_name='mat_view' AND need_update=TRUE
)need_update
flag with UPDATE config SET need_update=FALSE where mat_view_name='mat_view'
REFRESH MATERIALIZED VIEW mat_view
mat_view
as the target.UPDATE2: I tried creating the steps above:
Create a function that handles the four points mentioned above:
CREATE OR REPLACE FUNCTION mat_view_selector() RETURNS SETOF mat_view AS $body$ BEGIN -- here is checking whether to refresh the mat_view -- then return the select: RETURN QUERY SELECT * FROM mat_view; END; $body$ LANGUAGE plpgsql;
Create the view v_mat_view
which really selects from the function mat_view_selector
:
CREATE TABLE v_mat_view AS SELECT * from mat_view LIMIT 1; DELETE FROM v_mat_view; CREATE RULE "_RETURN" AS ON SELECT TO v_mat_view DO INSTEAD SELECT * FROM mat_view_selector(); -- this also converts the empty table 'v_mat_view' into a view.
# explain analyze select field1 from v_mat_view where field2 = 44; QUERY PLAN Function Scan on mat_view_selector (cost=0.25..12.75 rows=5 width=4) (actual time=15.457..18.048 rows=1 loops=1) Filter: (field2 = 44) Rows Removed by Filter: 20021 Total runtime: 31.753 ms
in comparison to selecting from the mat_view itself:
# explain analyze select field1 from mat_view where field2 = 44; QUERY PLAN Index Scan using mat_view_field2 on mat_view (cost=0.29..8.30 rows=1 width=4) (actual time=0.015..0.016 rows=1 loops=1) Index Cond: (field2 = 44) Total runtime: 0.036 ms
So essentially it DOES work, but performance might be an issue.
Anyone have better ideas? If not, then I would have to implement it somehow in the application logic or worse: run a simple cronjob that runs every minute or so.
You can set autorefresh for materialized views using CREATE MATERIALIZED VIEW. You can also use the AUTO REFRESH clause to refresh materialized views automatically.
To update the data in a materialized view, you can use the REFRESH MATERIALIZED VIEW statement at any time. When you use this statement, Amazon Redshift identifies changes that have taken place in the base table or tables, and then applies those changes to the materialized view.
If you want the materialized view to be refreshed automatically you should use ON COMMIT refresh method. Since you have specified an ON DEMAND refresh you will have to manually refresh the materialized view using DBMS_MVIEW. REFRESH method. There are lot of considerations for refreshing a materialized view.
You should refresh the view in triggers after insert/update/delete/truncate for each statement on table1
and table2
.
create or replace function refresh_mat_view() returns trigger language plpgsql as $$ begin refresh materialized view mat_view; return null; end $$; create trigger refresh_mat_view after insert or update or delete or truncate on table1 for each statement execute procedure refresh_mat_view(); create trigger refresh_mat_view after insert or update or delete or truncate on table2 for each statement execute procedure refresh_mat_view();
In this way your materialized view is always up to date. This simple solution might be hard to accept with frequent inserts/updates and sporadic selects. In your case (seldom changes about twice a day) it ideally fits your needs.
To realize deferred refresh of a materialized view you need one of the following features:
Postgres has none of them, so it seems that there is no clear postgres solution.
Taking this into account I would consider a wrapper function for selects on mat_view, e.g.
CREATE OR REPLACE FUNCTION select_from_mat_view(where_clause text) RETURNS SETOF mat_view AS $body$ BEGIN -- here is checking whether to refresh the mat_view -- then return the select: RETURN QUERY EXECUTE FORMAT ('SELECT * FROM mat_view %s', where_clause); END; $body$ LANGUAGE plpgsql;
If it is acceptable in practice depends on particulars I do not know about.
PostgreSQL 9.4 added REFRESH CONCURRENTLY
to Materialized Views.
This may be what you're looking for when you describe trying to setup an asynchronous update of the materialized view.
Users selecting from the materialized view will see incorrect data until the refresh finishes, but in many scenarios that use a materialized view, this is an acceptable tradeoff.
Use a statement level trigger that watches the underlying tables for any changes and then refreshes the materialized view concurrently.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With