I am exploring materialized views to create de-normalized view to avoid joining multiple tables for read performance. APIs will read from the materialized views to provide data to clients.
I am using amazon aurora postgres (version 11).
I am using a unique index on the materialized view (MV) so that I can use the “refresh concurrently” option.
What I am noticing though is that when only a fraction of the rows get updated in one of the source tables and I try to refresh the view, it's pretty slow. In fact slower than populating the view for the first time. e.g.: to populate MV first time takes ~30 mins, refresh is taking more than an hour. less than 1% of rows have been updated. The main three tables involved in generating the MV have ~18 million, 27 million & 40 million rows.
The timeliness of the materialized view refresh is important so that data is not stale for too long.
I could go with custom tables to store the denormalized data instead of materialized views but would have to implement logic to refresh data. So planning to avoid that if possible.
Is there anything that can be done to speed up the refresh process of the materialized views?
Please let me know if you need more details.
thanks Kiran
Refreshing a materialized view is slow even if little has changed, because every time the view is refreshed, the defining query is run.
Using CONCURRENTLY makes the operation even slower, because it is not a wholesale replacement of the materialized view contents, but modification of the existing data.
Perhaps you could create a denormalized table that is updated by a trigger whenever the underlying tables are modified.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With