Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can MySQL create new partitions from the event scheduler

I'm having a table looking something like this:

CREATE TABLE `Calls` (
  `calendar_id` int(11) NOT NULL,
  `db_date` timestamp NOT NULL,
  `cgn` varchar(32) DEFAULT NULL,
  `cpn` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`calendar_id`),
  KEY `db_date_idx` (`db_date`)
) 
 PARTITION BY RANGE (calendar_id)(
   PARTITION p20091024 VALUES LESS THAN (20091024) ,
   PARTITION p20091025 VALUES LESS THAN (20091025));

Can I somehow use the mysql scheduler to automatically add a new partition(2 days in advance) - I'm looking for an example that would, every day add a new partition - it'd run something like

alter table Calls add partition (partition p20091026 values less than(20091026));

Where p20091026/20091026 is constructed when the scheduled task run, deriving the value from now + 2 day. (Or am I better of scripting this through cron ?)

like image 579
nos Avatar asked Nov 23 '09 22:11

nos


People also ask

Can we create partition on existing table in MySQL?

This table can be partitioned by HASH , using the id column as the partitioning key, into 8 partitions by means of this statement: ALTER TABLE t1 PARTITION BY HASH(id) PARTITIONS 8; MySQL supports an ALGORITHM option with [SUB]PARTITION BY [LINEAR] KEY .

How do I partition a MySQL database?

We can create a partition in MySQL using the CREATE TABLE or ALTER TABLE statement. Below is the syntax of creating partition using CREATE TABLE command: CREATE TABLE [IF NOT EXISTS] table_name. (column_definitions)

Can we do partitioning in MySQL?

MySQL supports several types of partitioning as well as subpartitioning; see Section 22.2, “Partitioning Types”, and Section 22.2. 6, “Subpartitioning”. Section 22.3, “Partition Management”, covers methods of adding, removing, and altering partitions in existing partitioned tables.

How do I create a partition table in MySQL?

Here is the basic syntax for creating a table partitioned by RANGE COLUMNS : CREATE TABLE table_name PARTITIONED BY RANGE COLUMNS(column_list) ( PARTITION partition_name VALUES LESS THAN (value_list)[, PARTITION partition_name VALUES LESS THAN (value_list)][, ...] )


2 Answers

Yes, you can do this.

Note that the scheduler isn't active by default (see Event Scheduler Configuration), so it's not a zero-risk option. For example, if your operations team migrates your app to a new server, but forgets to enable the scheduler, your app will get hosed. There's also special privileges needed, which again may need to be set up on a new server.

My advice: first, create a stored procedure (see code sample below) which handles periodic partition maintenance: dropping old partitions if the table gets too big, and adding enough new partitions (e.g. 1 week) so that even if the maintenance proc isn't run for a while, your app won't die.

Then redundantly schedule calls to that stored proc. Use the MySQL scheduler, use a cron job, and use any other way you like. Then if one scheduler isn't working, the other can pick up the slack. If you design the sproc correctly, it should be cheap to execute a no-op if it doesn't need to do anything. You might even want to call it from your app, e.g. as the first statement when generating a long-running report, or as part of your daily ETL process (if you have one). My point is that the achilles heel of scheduled tasks is ensuring that the scheduler is actually working-- so think about redundancy here.

Just make sure not to schedule all the calls at the same time so they won't step on each other! :-)

Here's a code sample for what your maintenance proc could look like-- first it prunes old partitions, then adds new ones. I left error checking and preventing multiple simultaneous executions as an exerise for the reader.

DELIMITER $$

DROP PROCEDURE IF EXISTS `test`.`UpdatePartitions` $$
CREATE PROCEDURE `test`.`UpdatePartitions` ()
BEGIN

  DECLARE maxpart_date date;
  DECLARE partition_count int;
  DECLARE minpart date;
  DECLARE droppart_sql date;
  DECLARE newpart_date date;
  DECLARE newpart_sql varchar(500);

  SELECT COUNT(*)
    INTO partition_count
    FROM INFORMATION_SCHEMA.PARTITIONS
    WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';

  -- first, deal with pruning old partitions
  -- TODO: set your desired # of partitions below, or make it parameterizable
  WHILE (partition_count > 1000)
  DO

    -- optionally, do something here to deal with the parition you're dropping, e.g.
    -- copy the data into an archive table

     SELECT MIN(PARTITION_DESCRIPTION)
       INTO minpart
       FROM INFORMATION_SCHEMA.PARTITIONS
       WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';

     SET @sql := CONCAT('ALTER TABLE Calls DROP PARTITION p'
                        , CAST((minpart+0) as char(8))
                        , ';');

     PREPARE stmt FROM @sql;
     EXECUTE stmt;
     DEALLOCATE PREPARE stmt;

    SELECT COUNT(*)
      INTO partition_count
      FROM INFORMATION_SCHEMA.PARTITIONS
      WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';


  END WHILE;

  SELECT MAX(PARTITION_DESCRIPTION)
    INTO maxpart_date
    FROM INFORMATION_SCHEMA.PARTITIONS
    WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';

  -- create enough partitions for at least the next week
  WHILE (maxpart_date < CURDATE() + INTERVAL 7 DAY)
  DO

    SET newpart_date := maxpart_date + INTERVAL 1 DAY;
    SET @sql := CONCAT('ALTER TABLE Calls ADD PARTITION (PARTITION p'
                        , CAST((newpart_date+0) as char(8))
                        , ' values less than('
                        , CAST((newpart_date+0) as char(8))
                        , '));');

    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;

    SELECT MAX(PARTITION_DESCRIPTION)
      INTO maxpart_date
      FROM INFORMATION_SCHEMA.PARTITIONS
      WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';

  END WHILE;

END $$

DELIMITER ;

BTW, partition maintenance (ensuring new partitions are created in advance, pruning old partitions, etc.) is, IMHO, critically important to automate. I've personally seen a large enterprise data warehouse go down for a day because a year's worth of partitions was cretaed initially but no one remembered to create more partitions once the next year came around. So it's very good you're thinking about automation here-- it bodes well for the project you're working on. :-)

like image 82
Justin Grant Avatar answered Sep 19 '22 11:09

Justin Grant


Excellent solution from Justin there. I took his code as the starting point for my current project and would like to mention a few things that came up while I was implementing it.

  1. The existing partition structure in the table you run this on should not include a MAXVALUE type partition - all partitions must be delimited by literal dates. This is because SELECT MAX(PARTITION_DESCRIPTION) will return 'MAXVALUE' which fails to be converted to a date in the next step. If you get odd message when calling the procedure saying something like: illegal mix of collations for '<', this could be the problem.

  2. It's a good idea to add: "AND TABLE_SCHEMA = 'dbname'" when selecting partition names from the INFORMATION_SCHEMA table, because while more than one partition can exist with the same name for the same table (in different databases), they are all listed in the INFORMATION_SCHEMA table together. Without the TABLE_SCHEMA specification your select eg. MAX(PARTITION_DESCRIPTION) will give you the max partition name among every existing partition for tables of that name in every database.

  3. Somewhere along the way I had problems with the ALTER TABLE xxx ADD PARTITION as it is in Justin's solution, I think it was that the same format for the partition name (yyyymmdd) was being used as the partition delimiter which expected yyyy-mm-dd (v5.6.2).

  4. The default behaviour is to only add partitions in the future as necessary. If you want to create partitions for the past, you will need to first set up a partition for a date older than the oldest partition you want. Eg. if you are keeping data for the past 30 days, first add a partition for say, 35 days ago and then run the procedure. Granted, it may only be feasible to do this on an empty table, but I thought it worth mentioning.

  5. In order to create the desired span of past/future partitions as in 4. you will initially need to run the procedure twice. For the example in 4. above, the first run will create partitions for -35 days to present, and the necessary future partitions. The second run will then trim the partitions between -35 and -30 away.

Here is what I am using at the moment. I added some parameters to make it a bit more flexible from the caller's point of view. You can specify the database, table, current date, and how many partitions to keep for both past and future.

I also altered the naming of partitions so that the partition named p20110527 represents the day starting from 2011-5-27 00:00 instead of the day ending at that time.

There is still no error checking or prevention of simultaneous execution :-)

DELIMITER $$

DROP PROCEDURE IF EXISTS UpdatePartitions $$

-- Procedure to delete old partitions and create new ones based on a given date.
-- partitions older than (today_date - days_past) will be dropped
-- enough new partitions will be made to cover until (today_date + days_future)
CREATE PROCEDURE UpdatePartitions (dbname TEXT, tblname TEXT, today_date DATE, days_past INT, days_future INT)
BEGIN

DECLARE maxpart_date date;
DECLARE partition_count int;
DECLARE minpart date;
DECLARE droppart_sql date;
DECLARE newpart_date date;
DECLARE newpart_sql varchar(500); 

SELECT COUNT(*)
INTO partition_count
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_NAME=tblname
AND TABLE_SCHEMA=dbname;

-- SELECT partition_count;

-- first, deal with pruning old partitions
WHILE (partition_count > days_past + days_future)
DO
-- optionally, do something here to deal with the parition you're dropping, e.g.
-- copy the data into an archive table

 SELECT STR_TO_DATE(MIN(PARTITION_DESCRIPTION), '''%Y-%m-%d''')
   INTO minpart
   FROM INFORMATION_SCHEMA.PARTITIONS
   WHERE TABLE_NAME=tblname
   AND TABLE_SCHEMA=dbname;

-- SELECT minpart;

 SET @sql := CONCAT('ALTER TABLE '
                    , tblname
                    , ' DROP PARTITION p'
                    , CAST(((minpart - INTERVAL 1 DAY)+0) as char(8))
                    , ';');

 -- SELECT @sql;
 PREPARE stmt FROM @sql;
 EXECUTE stmt;
 DEALLOCATE PREPARE stmt;

SELECT COUNT(*)
  INTO partition_count
  FROM INFORMATION_SCHEMA.PARTITIONS
  WHERE TABLE_NAME=tblname
  AND TABLE_SCHEMA=dbname;

-- SELECT partition_count;

END WHILE;

SELECT STR_TO_DATE(MAX(PARTITION_DESCRIPTION), '''%Y-%m-%d''')
INTO maxpart_date
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_NAME=tblname
AND TABLE_SCHEMA=dbname;

-- select maxpart_date;
-- create enough partitions for at least the next days_future days
WHILE (maxpart_date < today_date + INTERVAL days_future DAY)
DO

-- select 'here1';
SET newpart_date := maxpart_date + INTERVAL 1 DAY;
SET @sql := CONCAT('ALTER TABLE '
                    , tblname
                    , ' ADD PARTITION (PARTITION p'
                    , CAST(((newpart_date - INTERVAL 1 DAY)+0) as char(8))
                    , ' VALUES LESS THAN ('''
                    , newpart_date
                    , '''));');

-- SELECT @sql;
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

SELECT STR_TO_DATE(MAX(PARTITION_DESCRIPTION), '''%Y-%m-%d''')
  INTO maxpart_date
  FROM INFORMATION_SCHEMA.PARTITIONS
  WHERE TABLE_NAME=tblname
  AND TABLE_SCHEMA=dbname;

SET maxpart_date := newpart_date;

END WHILE;

END $$

DELIMITER ;
like image 39
iforce2d Avatar answered Sep 19 '22 11:09

iforce2d