Use a Materialized View to track latest versions of records

We've got a highly (perhaps over?) normalized table that keeps track of versioned values. It's insert only, no updates.

Example Data:

"ID"    "Version"   "Value"
1       0           "A_1"
2       0           "B_1"
1       1           "A_2"
3       0           "C_1"

We frequently run queries to pull only the latest values for each ID. As we're hitting millions of rows, we're starting to encounter performance problems. I've been able to prototype improvements using Materialized Views, but have not been able to create them in such a way that they self-refresh "ON COMMIT".

What I've got so far is this (revised below):

CREATE MATERIALIZED VIEW TABLE_LATEST 
    BUILD IMMEDIATE
    REFRESH FAST 
    ON COMMIT AS

SELECT  T.ID
       ,T.LAST_VERSION
FROM (
    SELECT  ID
           ,MAX(VERSION) OVER (PARTITION BY ID) LAST_VERSION
    FROM    TABLE
) T
GROUP BY T.ID, T.LAST_VERSION;

Revised version, based on feedback:

CREATE MATERIALIZED VIEW TABLE_LATEST 
    BUILD IMMEDIATE
    REFRESH FAST 
    ON COMMIT AS

SELECT  ID
       ,MAX(VERSION) LAST_VERSION
FROM    TABLE
GROUP BY ID;

This fails with:

ORA-12033: cannot use filter columns from materialized view log on "SCHEMA"."TABLE"

*Cause:    The materialized view log either did not have filter columns
           logged, or the timestamp associated with the filter columns was
           more recent than the last refresh time.

*Action:   A complete refresh is required before the next fast refresh.
           Add filter columns to the materialized view log, if required.

It will only 'work' if I change Refresh to Force and remove On Commit. I can't tell if this falls under the 'No Analytics' rule for Materialized Views or if perhaps I've created the Log incorrectly in the first place?

CREATE MATERIALIZED VIEW LOG ON TABLE
LOGGING 
WITH SEQUENCE, ROWID, (VALUE) 
INCLUDING NEW VALUES;

Table Schema:

CREATE TABLE "TABLE"
(
  ID NUMBER(10, 0) NOT NULL 
, VERSION NUMBER(10, 0) NOT NULL 
, VALUE VARCHAR2(4000 CHAR) 
, CONSTRAINT MASTERRECORDFIELDVALUES_PK PRIMARY KEY 
  (
    ID
  , VERSION 
  )
  USING INDEX 
  (
      CREATE UNIQUE INDEX TABLE_PK ON TABLE(ID ASC, VERSION ASC) 
      LOGGING 
      ...
  )
  ENABLE 
) 
LOGGING 

Am I even on the right track? Would there be a better performing way to pre-calculate latest versions? Or do I just need to get the Log & View settings dialed in?

asked Oct 30 '22 02:10 by Tom Halladay


2 Answers

If you don't need the value associated with the latest version, then you can simply do:

-- id and version are listed as filter columns because the MV below references them
CREATE MATERIALIZED VIEW LOG ON t1 
LOGGING  
WITH SEQUENCE, ROWID, (id, version, val) 
INCLUDING NEW VALUES;

create materialized view t1_latest
refresh fast on commit
as
  select id,
         max(version) latest_version
  from   t1
  group by id;

The test case for this can be found over at Oracle LiveSQL.
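
To check that the ON COMMIT refresh actually fires, a quick smoke test along these lines can help. This is only a sketch: it assumes t1 has the columns id, version and val (mirroring the question's table), that it starts out empty, and that the log and t1_latest above were created without errors.

```sql
-- Assumption: t1(id, version, val) is empty; the MV log and t1_latest already exist.
INSERT INTO t1 (id, version, val) VALUES (1, 0, 'A_1');
INSERT INTO t1 (id, version, val) VALUES (2, 0, 'B_1');
INSERT INTO t1 (id, version, val) VALUES (1, 1, 'A_2');
INSERT INTO t1 (id, version, val) VALUES (3, 0, 'C_1');
COMMIT;  -- the ON COMMIT refresh runs as part of this commit

-- The MV should now hold one row per id with that id's highest version
SELECT id, latest_version
FROM   t1_latest
ORDER  BY id;

-- Data dictionary view: how the MV refreshes and whether it is in sync
SELECT mview_name, refresh_mode, refresh_method, last_refresh_type, staleness
FROM   user_mviews
WHERE  mview_name = 'T1_LATEST';
```

If staleness shows anything other than FRESH right after the commit, the refresh most likely did not run the way it was meant to.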


If you do need the value, you have to create three separate MVs, because a fast-refreshable ON COMMIT materialized view can't use KEEP (DENSE_RANK FIRST/LAST). Following http://www.sqlsnippets.com/en/topic-12926.html, it looks like this:

Materialized view log on the main table:

-- id and version are listed as filter columns because the MVs below reference them
CREATE MATERIALIZED VIEW LOG ON t1
LOGGING 
WITH SEQUENCE, ROWID, (id, version, val)
INCLUDING NEW VALUES;

First materialized view:

create materialized view t1_sub_mv1
refresh fast on commit
as
  select id,
         max(version) latest_version,
         count(version) cnt_version,
         count(*) cnt_all
  from   t1
  group by id;

Materialized view log on the first materialized view:

create materialized view log on t1_sub_mv1
with rowid, sequence (id, latest_version, cnt_version, cnt_all)
including new values;

Second materialized view:

create materialized view t1_sub_mv2
refresh fast on commit
as
  select id,
         version,
         max(val) max_val_per_id_version,
         count(*) cnt_all
  from   t1
  group by id,
           version;

Materialized view log on the second materialized view:

create materialized view log on t1_sub_mv2
with rowid, sequence (id, max_val_per_id_version, cnt_all)
including new values;

Third and final materialized view:

create materialized view t1_main_mv
refresh fast on commit
as
  select mv1.id,
         mv1.latest_version,
         mv2.max_val_per_id_version val_of_latest_version,
         mv1.rowid mv1_rowid,
         mv2.rowid mv2_rowid
  from   t1_sub_mv1 mv1,
         t1_sub_mv2 mv2
  where  mv1.id = mv2.id
  and    mv1.latest_version = mv2.version;

The supporting test case for this can be found over at Oracle LiveSQL.
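
Reading the latest value is then a plain lookup against t1_main_mv. As a sanity check, the MV can also be compared with the KEEP (DENSE_RANK LAST) query it stands in for. The sketch below assumes t1(id, version, val) and the three MVs above; the bind variable :id is just a placeholder for whatever your application passes in.

```sql
-- Latest version and its value for a single id (:id is supplied by the caller)
SELECT latest_version, val_of_latest_version
FROM   t1_main_mv
WHERE  id = :id;

-- Cross-check: compute the same result directly from the base table and
-- subtract what the MV holds. Any row returned here is a latest version
-- that the MV is missing or has recorded differently.
SELECT id,
       MAX(version) AS latest_version,
       MAX(val) KEEP (DENSE_RANK LAST ORDER BY version) AS val_of_latest_version
FROM   t1
GROUP  BY id
MINUS
SELECT id, latest_version, val_of_latest_version
FROM   t1_main_mv;
```

The second query scans the whole base table, so it is meant for occasional verification rather than for regular use.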

answered Nov 15 '22 08:11 by Boneist


I'm sorry, but I can't give you a direct answer. One reason might be the use of analytic functions, which are not well supported by MVs. To analyze the problem, you will need to take a look at the capabilities of the materialized view:

DECLARE
  -- Query the materialized view would be built on
  v_sql VARCHAR2(32000) := 'SELECT  T.ID
                                   ,T.LAST_VERSION
                            FROM (SELECT  ID
                                         ,MAX(VERSION) OVER (PARTITION BY ID) LAST_VERSION
                                  FROM    TABLE) T
                            GROUP BY T.ID
                                    ,T.LAST_VERSION';

  v_msg_array SYS.EXPLAINMVARRAYTYPE;
  msg         SYS.ExplainMVMessage;
BEGIN
  -- Ask Oracle which MV capabilities (fast refresh, query rewrite, ...) the query supports
  dbms_mview.explain_mview(mv => v_sql, msg_array => v_msg_array);

  -- Print one block per capability (run with SET SERVEROUTPUT ON to see the output)
  FOR i IN v_msg_array.FIRST..v_msg_array.LAST LOOP
    msg := v_msg_array(i);
    DBMS_OUTPUT.put_line('MVOWNER:' || msg.MVOWNER);
    DBMS_OUTPUT.put_line('MVNAME:' || msg.MVNAME);
    DBMS_OUTPUT.put_line('CAPABILITY_NAME:' || msg.CAPABILITY_NAME);
    DBMS_OUTPUT.put_line('POSSIBLE:' || msg.POSSIBLE);
    DBMS_OUTPUT.put_line('RELATED_TEXT:' || msg.RELATED_TEXT);
    DBMS_OUTPUT.put_line('RELATED_NUM:' || msg.RELATED_NUM);
    DBMS_OUTPUT.put_line('MSGNO:' || msg.MSGNO);
    DBMS_OUTPUT.put_line('MSGTXT:' || msg.MSGTXT);
    DBMS_OUTPUT.put_line('SEQ:' || msg.SEQ);
    DBMS_OUTPUT.put_line('----------------------------------------');
  END LOOP;
END;
/
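
The same information can also be written to a table instead of DBMS_OUTPUT, which makes it easier to filter. A minimal sketch, assuming MV_CAPABILITIES_TABLE has already been created in your schema with Oracle's utlxmv.sql script; the table name t1 and the statement id 'latest_check' are placeholders I made up for the example:

```sql
-- One-time setup in SQL*Plus (creates MV_CAPABILITIES_TABLE in the current schema):
-- @?/rdbms/admin/utlxmv.sql

-- Analyze the candidate query; results are stored under the given statement id.
-- Arguments are passed positionally: the query text, then the statement id.
BEGIN
  DBMS_MVIEW.EXPLAIN_MVIEW(
    'SELECT id, MAX(version) AS last_version FROM t1 GROUP BY id',
    'latest_check');
END;
/

-- Show only the fast-refresh capabilities and the reason when one is not possible
SELECT capability_name, possible, related_text, msgtxt
FROM   mv_capabilities_table
WHERE  statement_id = 'latest_check'
AND    capability_name LIKE 'REFRESH_FAST%'
ORDER  BY seq;
```

Rows where POSSIBLE is 'N' carry the explanation in MSGTXT, and RELATED_TEXT usually names the table or column involved.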

BTW: You can write your query far more simply:

SELECT t.id,
       MAX(t.version) AS last_version
FROM table t
GROUP BY t.id;

answered Nov 15 '22 10:11 by fhossfel