
Would using partitions be a good idea in such a situation?

Context: Oracle 10 database.

In a rather large table (several million records) we recently started to see performance problems. The table has some special behaviours and conditions:

  • it's mostly write-once and then never gets changed again
  • during the first day or so the records are classified from 0..N (let's call that column class); records might get reclassified several times during that first day
  • new entries are added with class 0, meaning "not yet classified"
  • every hour or so a process classifies the new records and gives them a new class from 1..N
  • all the readers are only interested in class 1
  • records older than a day hardly ever change their class; records with class > 1 get cleaned up after a few days

Now, as most access is to class 1, that column is often involved in queries (class = 1), together with other conditions. We have an index on the class column, and separate indexes on certain other columns.

To my question: we are now thinking of partitioning that table by class. As far as I understand, this would make indexing and working with the data faster, because the class = 1 rows are already separated from the rest of the data, so access to them is implicitly more efficient. Is this correct?

If you agree that this is a good idea, I will read further into the topic!

Thanks Cheers

Update 2010.11.30

Thank you very much for the input. I wasn't aware that it's an extra option :) thanks for pointing that out (before I invest too much time in it). But besides the license issue, it appears to me that partitions aren't necessarily a good solution in this context.

reto asked Nov 29 '10 17:11



1 Answer

What operations are experiencing slowness and have you been able to identify why those operations are slow?

If you partition by class, you will be slowing down the process of updating the class for a row. Since that would force a row to move from one partition to another, you'd be turning an update into a delete from the first partition and an insert into the second partition. If your hourly process is slow and it is slow because it takes time to find all the new records, the performance trade-off here may be quite reasonable. If your hourly process is slow because it takes time to compute what the new class should be and to update all the rows, on the other hand, that trade-off is probably a very poor idea.
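To make the row-movement cost concrete, here is a minimal sketch of what partitioning by class could look like. The table name your_table_part, the payload column, the partition names, and the key value 42 are all hypothetical; only class and primary_key come from the discussion. Oracle rejects an update that would move a row into a different partition (ORA-14402) unless row movement has been enabled on the table.

```sql
-- Hypothetical list-partitioned version of the table (names are illustrative only).
CREATE TABLE your_table_part (
    primary_key NUMBER        NOT NULL,
    class       NUMBER        NOT NULL,
    payload     VARCHAR2(100)
)
PARTITION BY LIST (class) (
    PARTITION p_new    VALUES (0),        -- not yet classified
    PARTITION p_class1 VALUES (1),        -- the rows the readers care about
    PARTITION p_other  VALUES (DEFAULT)   -- classes 2..N
);

-- Without this, an update that changes the partition a row belongs to
-- fails with ORA-14402.
ALTER TABLE your_table_part ENABLE ROW MOVEMENT;

-- Reclassifying a row now relocates it physically:
-- it is removed from p_new and inserted into p_class1.
UPDATE your_table_part
   SET class = 1
 WHERE primary_key = 42;
```

Since the hourly job reclassifies every new record, each of those updates would pay this relocation cost, which is the trade-off described above.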

Because partitioning is an extra cost option on top of the enterprise edition license, I would suggest making sure that you can't use some function-based indexes to get most of the performance improvements you're targeting at relatively little cost. If, for example, you had two function-based indexes

CREATE INDEX idx_new_entries
    ON your_table( (CASE WHEN class = 0 THEN primary_key ELSE null END) );

CREATE INDEX idx_class1_entries
    ON your_table( (CASE WHEN class = 1 THEN primary_key ELSE null END) );

along with a couple of views

CREATE VIEW vw_new_entries
AS
SELECT (CASE WHEN class = 0 THEN primary_key ELSE null END) primary_key,
       <<list of columns>>
  FROM your_table
 WHERE class = 0;

CREATE VIEW vw_class1_entries
AS
SELECT (CASE WHEN class = 1 THEN primary_key ELSE null END) primary_key,
       <<list of columns>>
  FROM your_table
 WHERE class = 1;

then any queries against the new views that filter on PRIMARY_KEY can use the function-based indexes. Because Oracle does not store an entry in a B-tree index when the indexed key is entirely NULL, each of these indexes contains only the rows of the matching class in the underlying table. That may allow you to improve lookup performance without needing to resort to partitioning.
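As a sketch of how that would be used, a lookup through the view filters on the view's PRIMARY_KEY column, which expands to the same CASE expression the index was built on. The bind variable :id is a placeholder. EXPLAIN PLAN together with DBMS_XPLAN.DISPLAY is one way to check whether the optimizer actually chooses idx_class1_entries; that is not guaranteed and depends, among other things, on statistics having been gathered.

```sql
-- Ask the optimizer how it would run a lookup through the class-1 view.
EXPLAIN PLAN FOR
SELECT *
  FROM vw_class1_entries
 WHERE primary_key = :id;

-- Show the plan; look for an index access on IDX_CLASS1_ENTRIES
-- rather than a full scan of YOUR_TABLE.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the plan shows a full table scan instead, the cause is worth finding before concluding that function-based indexes cannot help here.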

Justin Cave answered Oct 16 '22 13:10
