Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best way to manage row expiration in mysql

An application does the following:

  • writes a row to a table that has a unique ID
  • read the table and find the unique ID and output the other variables (among which the timestamp).

The question is: the application needs to read only the non-expired rows, which do expire every 2 minutes. There are a few alternatives to accomplish this: which has the best performance?

Consider that reading the expired rows doesn't matter as it will be done sporadically. The number of non expired rows will always be below a few thousands, but may expect just a few hundreds.

The cron job (run every minute) (or mysql event schedule) to do this can be one of the following (or any other idea?) based on the timestamp:

A) add a BOL variable to index the table and then read WHERE is not expired (based on the boll variable of course) B) add a BOL variable to partition the table and then read only the relevant partition (i am new to partitioning so I'm not sure how this may work out) C) read the whole table and delete each row that is expired and then write the same row to another table D) when writing, write two rows contemporarily in two tables and then delete the expired ones on in one table

OR

E) not use a cron job at all and check the timestamp on every read. (but why would I scan the whole table? Expired row are basically useless to the application itself)

Edit 1

Let me rephrase the question:

The objective is to retrieve all columns of a row only if the row was written less than 2 minutes ago.

The table has this structure, but can be entirely redefined

transactionID CHAR(8) NOT NULL, 
PRIMARY KEY(transactionID),
details VARCHAR(416),
tel INT(10),
time TIMESTAMP(),
address VARCHAR(60),
town VARCHAR(30),
flat VARCHAR (5),
supplier SMALLINT()

also supplier is a foreign key

Indexes are transactionID and eventually "status", an extra column, data type TO_BE_DEFINED_but_probably_SMALLINT_?()

Solution 1: use an indexed column which indicates the status of the row (expired, active, used)

Which is achieved running a cron job to change the field value from 1 (active) to 2 (expired) and the query would be the following:

$transaction = //a POST or GET 

$query = mysqli_query($con,"SELECT * FROM table WHERE transaction = '$transaction' AND status = 1");

if(mysql_num_rows($query)== 0){
   echo "Transaction expired";
}
else{
// I will show all the fields in the selected row;
}

mysqli_close($con);

Solution 2: use a partition based on the timestamp of each row

Which is achieved running a cron job to partition the column every 2 minutes, but then the query is probably faster:

$transaction = //a POST or GET 

$query = mysqli_query($con,"SELECT * FROM table WHERE transaction = '$transaction' AND PARTITION (active)");

if(mysql_num_rows($query)== 0){
   echo "Transaction expired";
}
else{
// I will show all the fields in the selected row;
}

mysqli_close($con);

THe cron job would then be similar to this

$date = date("Y-m-d H:i:s");
$time = strtotime($date);
$check = $time - (2);


$query = mysqli_query($con,"PARTITION BY RANGE (timestamp_field) 
(PARTITION active VALUES LESS THAN ($check)),
((PARTITION expired VALUES MORE THAN ($check));")

Solution 3: forget all of this and copy the rows with a timestamp older than 2 minutes to another table with a cron job

Simple to accomplish although the script would be probably heavy on the write side, altough write efficiency to the "expired" table is irrelevant, I want the "active" query to be fast.

Solution 3B: When adding a row, write it also on another table so that it's only a DELETE function that the cron job had to perform.

Solution 4: forget all about it and WHERE on the timestamp

$transaction = //a POST or GET 

$date = date("Y-m-d H:i:s");
$time = strtotime($date);
$check = $time - (2);

$query = mysqli_query($con,"SELECT * FROM table WHERE transaction = '$transaction' AND timestamp > $check");

if(mysql_num_rows($query)== 0){
   echo "Transaction expired";
}
else{
// I will show all the fields in the selected row;
}

mysqli_close($con);
like image 922
smartcity Avatar asked Dec 23 '13 13:12

smartcity


People also ask

Does MySQL support TTL?

Scylla, Cassandra, DynamoDB, MySQL, Oracle, PostgreSQL, and SQL Serve are all major platforms that support TTL, providing the functionality to delete expired data according to the TTL value automatically.

Do MySQL passwords expire?

MySQL enables database administrators to expire account passwords manually, and to establish a policy for automatic password expiration.

What is the limit of rows in MySQL?

The MySQL maximum row size limit of 65,535 bytes is demonstrated in the following InnoDB and MyISAM examples. The limit is enforced regardless of storage engine, even though the storage engine may be capable of supporting larger rows.


1 Answers

You didn't mention which version of MySQL you are using. As long as you have 5.1, you have event scheduler.

CREATE EVENT clearExpired
ON SCHEDULE AT CURRENT_TIMESTAMP + INTERVAL 2 MINUTES
DO
   UPDATE table SET status=0
    WHERE status <> 0
      AND TIMESTAMPDIFF(SECONDS, NOW(),table.timestamp)>120;

or

CREATE EVENT clearExpired
ON SCHEDULE AT CURRENT_TIMESTAMP + INTERVAL 2 MINUTES
DO
   DELETE FROM table
    WHERE TIMESTAMPDIFF(SECONDS, NOW(),table.timestamp)>120;

In this case, you also set a trigger that will copy all the deleted rows into a history table before deleting them.

For measuring efficiency, create a table events (columns: timestamp and action), and generate huge number of rows in your table, then change your event to:

...
DO
   BEGIN
      INSERT INTO events SET action="clearExpired start";
      <insert or delete>
      INSERT INTO events SET action="clearExpired end";
   END

Then you can see how long it takes one way vs. the other.

However, this event will run only once every minute or two, while there will be many more inserts. You want to optimize the actions that happen most frequently.

like image 63
Marjeta Avatar answered Sep 23 '22 11:09

Marjeta