Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Refresh strategy for materialized views in a data warehouse

I have a system that has a materialized view that contains roughly 1 billion items, on a consistent two hour basis I need to update about 200 million (20% of the records). My question is what should the refresh strategy on my materialized view be? As of right now it is refresh with an interval. I am curious as to the performance impacts between refreshing on an interval vice refresh never and rename/replace the old materialized view with the new one. The underlying issue is the indices that are used by Oracle which creates a massive amount of redo. Any suggestions are appreciated.

UPDATE
Since some people seem to think this is off topic my current view point is to do the following:

Create an Oracle Schedule Chain that invokes a series of PL/SQL (programming language I promise) functions to refresh materialized view in a pseudo-parallel fashion. However, being as though I fell into the position of a DBA of sorts, I am looking to solve a data problem with an algorithm and/or some code.

like image 344
Woot4Moo Avatar asked Nov 04 '22 13:11

Woot4Moo


1 Answers

Ok so here is the solution I came up with, your mileage may vary and any feedback is appreciated after the fact. The overall strategy was to do the following:

1) Utilize the Oracle Scheduler making use of parallel execution of chains (jobs)
2) Utilize views (the regular kind) as the interface from the application into the database
3) Rely on materialized views to be built in the following manner

 create materialized view foo  
    parallel  
    nologging  
    never refresh  
    as  
    select statement

as needed use the following:

   create index baz on foo(bar) nologging

The advantage of this is that we can build the materialized view in the background before dropping + recreating the view as described in step 2. Now the advantage is creating dynamically named materialized views, while keeping the view with the same name. The key is to not blow away the original materialized view until the new one is finished. This also allows for quick drops, as there is minimum redo to care about. This enabled materialized view creation on ~1 billion records in 5 minutes which met our requirement of "refreshes" every thirty minutes. Further this is able to be handled on a single database node, so even with constrained hardware, it is possible.

Here is a PL/SQL function that will create it for you:

CREATE OR REPLACE procedure foo_bar as
foo_view varchar2(500) := 'foo_'|| to_char(sysdate,'dd_MON_yyyy_hh_mi_ss');
BEGIN
 execute immediate
 'Create materialized view '||  foo_view || '
  parallel
  nologging
  never refresh
  as
  select * from cats';
END foo_bar;
like image 65
Woot4Moo Avatar answered Nov 15 '22 04:11

Woot4Moo