MySQL database table data purge/removal using MySQL Event Scheduler: 1. Deleting table data in batches/chunks 2. Logging each iteration 3. Handling & logging errors 4. Creating recurring event to cleanup/purge table data regularly

Recently, I was working on creating a utility to purge the table data of MySQL Database.  In this post, I will be sharing my experience of how we can create a recurring event in MySQL to purge/remove the table data.

Step I: Basic  Setup (Sample Database/Tables)

orders is the table for which we need to create purge job/utility.

CREATE TABLE `orders` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `OrderNumber` varchar(45) NOT NULL,
  `Total` double DEFAULT NULL,
  `Tax` double DEFAULT NULL,
  `Status` int(11) NOT NULL,
  `OrderDateUtc` datetime NOT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=112377 DEFAULT CHARSET=utf8;

Orders Table

*Note: Insert data in orders table

log table to store the purge job/utility logs

CREATE TABLE `log` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`Message` varchar(150) COLLATE utf8_bin NOT NULL,
`CreatedDateUtc` datetime NOT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=81 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

log table

Step II: Get minimum & maximum value of ‘ID’ (auto-increment column in orders table) satisfying Date range condition (based on DaysToRetainParam)

Here, DaysToRetainParam is input parameter & is used to specify duration in days for which we want to retain the data in the orders table.

DECLARE minID INT;
DECLARE maxID INT;
DECLARE createdDateUtcForIteration DATETIME;
SET createdDateUtcForIteration = UTC_TIMESTAMP();

SELECT 
    MIN(id), MAX(id)
    INTO minID, maxID
FROM
    sample.orders AS orders
WHERE
    OrderDateUtc < DATE_SUB(createdDateUtcForIteration,
                INTERVAL DaysToRetainParam DAY);

Step III: Delete table data in batches/chunks (here chunk size is 1000) with a sleep of 1 second between batches.

DECLARE createdDateUtcForIteration DATETIME;
DECLARE rowsDeleted INT;
DECLARE rowCount INT;
DECLARE maxBatchID INT;
SET createdDateUtcForIteration = UTC_TIMESTAMP();
SET rowsDeleted = 0;


OrdersPurge: LOOP

  SET maxBatchID = 0;
    
  SELECT 
    id
    INTO maxBatchID
  FROM
    sample.orders AS orders
  WHERE
    id >= minID AND id <= maxID AND
    OrderDateUtc < DATE_SUB(createdDateUtcForIteration,
                INTERVAL DaysToRetainParam DAY)	
  ORDER BY id
  LIMIT 1000, 1;
    
  IF maxID < minID OR minID is null OR maxBatchID = minID OR maxBatchID is null OR maxBatchID = 0 THEN
    -- SELECT minID, maxID, maxBatchID, 'exit';
         LEAVE OrdersPurge;  -- last chunk
  END IF;
    
  DELETE FROM sample.orders
  WHERE
    id >= minID AND id < maxBatchID
    AND OrderDateUtc < DATE_SUB(createdDateUtcForIteration,
            INTERVAL DaysToRetainParam DAY);
                
  SELECT ROW_COUNT() INTO rowCount;
    -- SELECT rowCount;
    SET rowsDeleted = rowsDeleted + rowCount;
  SET minID = maxBatchID;
    
    -- Log message
  INSERT INTO sample.log
  (Message,
  CreatedDateUtc)
  VALUES
  (concat('Iteration: ' , rowCount, ' rows deleted.'),
  UTC_TIMESTAMP());

  DO SLEEP(1);
    
END LOOP OrdersPurge;

Step IV: Delete the last batch

IF minID is not null THEN
  DELETE FROM sample.orders
  WHERE
    id >= minID AND id <= maxID
    AND OrderDateUtc < DATE_SUB(createdDateUtcForIteration,
        INTERVAL DaysToRetainParam DAY);

Step IV: Handle & Log Error

DECLARE errorCode CHAR(5) DEFAULT '00000';
DECLARE errorMessage TEXT;

DECLARE EXIT HANDLER FOR SQLEXCEPTION
BEGIN

  GET DIAGNOSTICS CONDITION 1 errorCode = RETURNED_SQLSTATE, errorMessage = MESSAGE_TEXT;
  
  INSERT INTO sample.log
  (Message,
  CreatedDateUtc)
  VALUES
  (CONCAT('Error = ',errorCode,', message = ',errorMessage),
  UTC_TIMESTAMP());
END;

Step V: Full Code of OrdersPurge Stored Procedure

CREATE DEFINER=`root`@`localhost` PROCEDURE `OrdersPurge`(
IN DaysToRetainParam int
)
BEGIN

DECLARE createdDateUtcForIteration DATETIME;
DECLARE rowsDeleted INT;
DECLARE rowCount INT;
DECLARE minID INT;
DECLARE maxID INT;
DECLARE maxBatchID INT;
DECLARE errorCode CHAR(5) DEFAULT '00000';
DECLARE errorMessage TEXT;

DECLARE EXIT HANDLER FOR SQLEXCEPTION
BEGIN

  GET DIAGNOSTICS CONDITION 1 errorCode = RETURNED_SQLSTATE, errorMessage = MESSAGE_TEXT;
  
  INSERT INTO sample.log
  (Message,
  CreatedDateUtc)
  VALUES
  (CONCAT('Error = ',errorCode,', message = ',errorMessage),
  UTC_TIMESTAMP());
END;


SET createdDateUtcForIteration = UTC_TIMESTAMP();

INSERT INTO sample.log
(Message,
CreatedDateUtc)
VALUES
('Started.',
UTC_TIMESTAMP());

SELECT 
    MIN(id), MAX(id)
    INTO minID, maxID
FROM
    sample.orders AS orders
WHERE
    OrderDateUtc < DATE_SUB(createdDateUtcForIteration,
                INTERVAL DaysToRetainParam DAY);

SET rowsDeleted = 0;

OrdersPurge: LOOP

  SET maxBatchID = 0;
    
  SELECT 
    id
    INTO maxBatchID
  FROM
    sample.orders AS orders
  WHERE
    id >= minID AND id <= maxID AND
    OrderDateUtc < DATE_SUB(createdDateUtcForIteration,
                INTERVAL DaysToRetainParam DAY)	
  ORDER BY id
  LIMIT 1000, 1;
    
  IF maxID < minID OR minID is null OR maxBatchID = minID OR maxBatchID is null OR maxBatchID = 0 THEN
    -- SELECT minID, maxID, maxBatchID, 'exit';
         LEAVE OrdersPurge;  -- last chunk
  END IF;
    
  DELETE FROM sample.orders
  WHERE
    id >= minID AND id < maxBatchID
    AND OrderDateUtc < DATE_SUB(createdDateUtcForIteration,
            INTERVAL DaysToRetainParam DAY);
                
  SELECT ROW_COUNT() INTO rowCount;
    -- SELECT rowCount;
    SET rowsDeleted = rowsDeleted + rowCount;
  SET minID = maxBatchID;
    
    -- Log message
  INSERT INTO sample.log
  (Message,
  CreatedDateUtc)
  VALUES
  (concat('Iteration: ' , rowCount, ' rows deleted.'),
  UTC_TIMESTAMP());

  DO SLEEP(1);
    
END LOOP OrdersPurge;

IF minID is not null THEN
  DELETE FROM sample.orders
  WHERE
    id >= minID AND id <= maxID
    AND OrderDateUtc < DATE_SUB(createdDateUtcForIteration,
        INTERVAL DaysToRetainParam DAY);
  
    SELECT ROW_COUNT() INTO rowCount;
  -- SELECT rowCount;
  SET rowsDeleted = rowsDeleted + rowCount;
  -- SELECT rowsDeleted as TotalRowsDeleted;
    
    -- Log message
  INSERT INTO sample.log
  (Message,
  CreatedDateUtc)
  VALUES
  (concat('Iteration: ' , rowCount, ' rows deleted.'),
  UTC_TIMESTAMP());
    
END IF;

-- Log message
INSERT INTO sample.log
(Message,
CreatedDateUtc)
VALUES
(concat('Completed successfully. ', rowsDeleted, ' rows deleted.'),
UTC_TIMESTAMP());

END

Step VI: Check & Enable MySQL Event Scheduler

Check whether the MySQL Event Scheduler is enabled:

SHOW PROCESSLIST;

If it is enabled, it will be listed in the output.

Check Event Scheduler

If it is not enabled, then enable it by updating the MySQL my.ini or my.cnf file as shown below:

[mysqld]

# Event Scheduler
# OFF: The Event Scheduler is stopped.
# ON: The Event Scheduler is started; the event scheduler thread runs and executes all scheduled events.
# DISABLED: This value renders the Event Scheduler nonoperational.
event_scheduler=ON

Or,

SET GLOBAL event_scheduler = ON;

Step VII: Create a recurring event in MySQL

OrdersPurgeEvent Event:

  1. Starts at ‘2019-06-15 11:42:00’
  2. Run everyday at 11:42:00
  3. Calls the OrdersPurge Stored Procedure.

Note: Do change Start DateTime to any future value at which you want event to start.

DROP EVENT IF EXISTS OrdersPurgeEvent;
CREATE EVENT IF NOT EXISTS OrdersPurgeEvent
  ON SCHEDULE EVERY 1 DAY
  STARTS '2019-06-15 11:42:00'
  DO
    CALL sample.OrdersPurge(30);

Check Event:

SHOW EVENTS;

MySql Event

Step VIII: Table data purge job utility log of one the iteration

MySql Event Scheduler Iteration

That’s it.

You may also like...

Leave a Reply

Your email address will not be published. Required fields are marked *