Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ignore cascade on foreign key update?

To preface, I'm not very experienced with database design. I have a table of hashes and ids. When a group of new hashes are added, each row in the group gets the same id. If any hash within the new group already exists in the database, all hashes in the new group and existing group(s) get a new, shared id (effectively merging ids when hashes are repeated):

INSERT INTO hashes 
    (id, hash) 
VALUES 
    ($new_id, ...), ($new_id, ...)
ON DUPLICATE KEY UPDATE 
    repeat_count = repeat_count + 1;

INSERT INTO hashes_lookup SELECT DISTINCT id FROM hashes WHERE hash IN (...);
UPDATE hashes JOIN hashes_lookup USING (id) SET id = '$new_id';
TRUNCATE TABLE hashes_lookup;

Other tables reference these ids, so that if an id changes, foreign key constraints take care of updating the ids across tables. The issue here, however, is that I can't enforce uniqueness across any of the child tables. If I do, my queries fail with:

Foreign key constraint for table '...', record '...' would lead to a duplicate entry in table '...'

This error makes sense, given the following test case where id and value are a composite unique key:

id | value
---+-------
a  | 1
b  | 2
c  | 1

Then a gets changed to c:

id | value
---+-------
c  | 1
b  | 2
c  | 1

But c,1 already exists.

It would be ideal if there was an ON UPDATE IGNORE CASCADE option, so that if a duplicate row exists, any duplicating inserts are ignored. However, I'm pretty sure the real issue here is my database design, so I am open to any and all suggestions. My current solution is to not enforce uniqueness across child tables, which leads to a lot of redundant rows.

Edit:

CREATE TABLE `hashes` (
 `hash` char(64) NOT NULL,
 `id` varchar(128) NOT NULL,
 `repeat_count` int(11) NOT NULL DEFAULT '0',
 `insert_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `update_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
 UNIQUE KEY `hash` (`hash`) USING BTREE,
 KEY `id` (`id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `emails` (
 `id` varchar(128) NOT NULL,
 `group_id` char(5) NOT NULL,
 `email` varchar(500) NOT NULL,
 KEY `index` (`id`) USING BTREE,
 UNIQUE KEY `id` (`id`,`group_id`,`email`(255)) USING BTREE,
 CONSTRAINT `emails_ibfk_1` FOREIGN KEY (`id`) REFERENCES `hashes` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1
like image 977
Charlie Avatar asked Jul 20 '16 16:07

Charlie


1 Answers

I think will be good to create table hash_group to store id of hash group:

CREATE TABLE `hash_group` (
 `id` BIGINT AUTO_INCREMENT NOT NULL,
 `group_name` varchar(128) NOT NULL,
 `insert_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `update_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
 UNIQUE KEY `group_name` (`group_name`) USING BTREE,
 PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

And change structure of existing tables:

CREATE TABLE `hashes` (
 `hash` char(64) NOT NULL,
 `hash_group_id` BIGINT NOT NULL,
 `repeat_count` int(11) NOT NULL DEFAULT '0',
 `insert_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `update_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
 UNIQUE KEY `hash` (`hash`) USING BTREE,
 KEY `hashes_hash_group_id_index` (`hash_group_id`) USING BTREE,
 CONSTRAINT `hashes_hash_group_id_fk` FOREIGN KEY (`hash_group_id`) REFERENCES `hash_group` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `emails` (
 `hash_group_id` BIGINT NOT NULL,
 `group_id` char(5) NOT NULL,
 `email` varchar(500) NOT NULL,
 KEY `emails_hash_group_id_index` (`hash_group_id`) USING BTREE,
 UNIQUE KEY `emails_unique` (`hash_group_id`,`group_id`,`email`(255)) USING BTREE,
 CONSTRAINT `emails_ibfk_1` FOREIGN KEY (`hash_group_id`) REFERENCES `hash_group` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Also create trigger to update hash group if you need to do it:

DELIMITER $$
CREATE TRIGGER `update_hash_group_name` AFTER UPDATE ON `hashes`
FOR EACH ROW
BEGIN
    UPDATE `hash_group` 
    SET `group_name` = md5(now()) -- replace to you hash formula
    WHERE id = NEW.hash_group_id;
END;$$
DELIMITER ;

And create function for getting actual group id:

DROP FUNCTION IF EXISTS get_hash_group;

DELIMITER $$
CREATE FUNCTION get_hash_group(id INT) RETURNS INT
BEGIN
  IF (id IS NULL) THEN
    INSERT INTO `hash_group` (`group_name`) 
    VALUES (md5(now())); -- replace to you hash
    RETURN LAST_INSERT_ID();
  END IF;

  RETURN id;
END;$$
DELIMITER ;

Scenario:

Initial fill:

INSERT INTO `hash_group` (id, group_name) VALUES 
(1, 'test1'),
(2, 'test2'),
(3, 'test3');

INSERT INTO `hashes` (hash, hash_group_id) VALUES
('hash11', 1),
('hash12', 1),
('hash13', 1),
('hash2', 2),
('hash3', 3);

INSERT INTO `emails` (hash_group_id, group_id, email)
VALUES
(1, 'g1', 'example1@'),
(2, 'g1', 'example2@'),
(3, 'g1', 'example2@');

Updating of hash_group scenario:

START TRANSACTION;

-- Get @min_group_id - minimum group id (we will leave this id and delete other)

SELECT MIN(hash_group_id) INTO @min_group_id
FROM hashes 
WHERE hash IN ('hash11', 'hash12', 'hash2', 'hash15');

-- Replace other group ids in email table to @min_group_id

UPDATE `emails` 
SET `hash_group_id` = @min_group_id
WHERE `hash_group_id` IN (
  SELECT hash_group_id
  FROM hashes 
  WHERE @min_group_id IS NOT NULL
  AND hash IN ('hash11', 'hash12', 'hash2', 'hash15')
  -- Update only if we are gluy several hash_groups
  AND `hash_group_id` > @min_group_id
);

-- Delete other hash_groups and leave only group with @min_group_id

DELETE FROM `hash_group` WHERE `id` IN (
  SELECT hash_group_id
  FROM hashes 
  WHERE @min_group_id IS NOT NULL
  AND hash IN ('hash11', 'hash12', 'hash2', 'hash15')
  -- Delete only if we are gluy several hash_groups
  AND `hash_group_id` > @min_group_id
);

-- @group_id = existing hash_group.id or create new if @min_group_id is null (all inserted hashes are new)

SELECT get_hash_group(@min_group_id) INTO @group_id;

-- Now we can insert new hashes.

INSERT INTO `hashes` (hash, hash_group_id) VALUES
('hash11', @group_id),
('hash12', @group_id),
('hash2', @group_id),
('hash15', @group_id)
ON DUPLICATE KEY 
UPDATE repeat_count = repeat_count + 1;


COMMIT;
like image 66
Nick Avatar answered Oct 09 '22 17:10

Nick