I'm having difficult to understand the desired behavior of ON UPDATE CASCADE when one field is nullable and it's reference is updated.
In the following example of tables city and statistic, when we have a city and it's province has no value (NULL) the trigger ON UPDATE CASCADE does not work properly while I was expecting it to update the province value on statistics.
When we run a update on city with EXPLAIN we can easily see this behaivor.
Note: I'm using PostgreSQL 9.4.8
Creating the tables to reproduce the case
CREATE TABLE city (
id BIGSERIAL PRIMARY KEY,
name VARCHAR(64) NOT NULL,
country_id BIGINT NOT NULL,
province_id BIGINT,
CONSTRAINT city_city_country_province_un UNIQUE (id, country_id, province_id)
);
CREATE TABLE statistics (
id BIGSERIAL PRIMARY KEY,
city_id BIGINT NOT NULL,
country_id BIGINT NOT NULL,
province_id BIGINT,
some_data INTEGER NOT NULL DEFAULT 0,
CONSTRAINT statistics_city_country_province_fk FOREIGN KEY (city_id, country_id, province_id) REFERENCES city (id, country_id, province_id) ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT statistics_city_un UNIQUE (city_id)
);
Populating tables
INSERT INTO city (name, country_id, province_id) VALUES ('SAO CARLOS', 1, 1);
INSERT INTO city (name, country_id, province_id) VALUES ('VATICAN CITY', 2, NULL);
INSERT INTO statistics (city_id, country_id, province_id) VALUES (1, 1, 1);
INSERT INTO statistics (city_id, country_id, province_id) VALUES (2, 1, NULL);
Updating city when province is NOT NULL
EXPLAIN ANALYZE VERBOSE UPDATE city SET province_id = 3 WHERE id = 1;
QUERY PLAN
Update on public.city (cost=0.15..8.17 rows=1 width=168) (actual time=0.238..0.238 rows=0 loops=1)
-> Index Scan using city_city_country_province_un on public.city (cost=0.15..8.17 rows=1 width=168) (actual time=0.046..0.048 rows=1 loops=1)
Output: id, name, country_id, 3::bigint, ctid
Index Cond: (city.id = 1)
Planning time: 0.510 ms
Trigger RI_ConstraintTrigger_a_41406 for constraint statistics_city_country_province_fk on city: time=0.792 calls=1
Trigger RI_ConstraintTrigger_c_41408 for constraint statistics_city_country_province_fk on statistics: time=0.296 calls=1
Execution time: 1.412 ms
city after the update
id | city_id | country_id | province_id | some_data
----+---------+------------+-------------+-----------
1 | 1 | 1 | 3 | 0
2 | 2 | 2 | | 0
statistics after the update
id | city_id | country_id | province_id | some_data
----+---------+------------+-------------+-----------
1 | 1 | 1 | 3 | 0
2 | 2 | 2 | | 0
Updating city when province is NULL
EXPLAIN ANALYZE VERBOSE UPDATE city SET province_id = 7 WHERE id = 2;
QUERY PLAN
Update on public.city (cost=0.15..8.17 rows=1 width=168) (actual time=0.170..0.170 rows=0 loops=1)
-> Index Scan using city_city_country_province_un on public.city (cost=0.15..8.17 rows=1 width=168) (actual time=0.042..0.044 rows=1 loops=1)
Output: id, name, country_id, 7::bigint, ctid
Index Cond: (city.id = 2)
Planning time: 0.423 ms
Execution time: 0.225 ms
city after the update
id | name | country_id | province_id
----+--------------+------------+-------------
1 | SAO CARLOS | 1 | 3
2 | VATICAN CITY | 2 | 7
statistics after the update
id | city_id | country_id | province_id | some_data
----+---------+------------+-------------+-----------
1 | 1 | 1 | 3 | 0
2 | 2 | 2 | | 0
EDIT: work arround to update when the field is NULL
Create a function to update when old value is NULL
CREATE OR REPLACE FUNCTION update_null_province_reference() RETURNS TRIGGER AS $$
BEGIN
IF (OLD.province_id IS NULL AND NEW.province_id IS NOT NULL) THEN
UPDATE statistics SET province_id = NEW.province_id WHERE city_id = NEW.id;
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
Create a trigger on city
CREATE TRIGGER update_null_province_reference
AFTER UPDATE ON city
FOR EACH ROW
EXECUTE PROCEDURE update_null_province_reference();
There is nothing unexpected happening here, and in fact the NULL
value is the reason.
When you execute the following UPDATE
statement:
UPDATE city SET province_id = 3 WHERE id = 1
Postgre detects that the (id, country_id, province_id)
combination has changed in the city
table, and then looks for the record with the same values in the statistics
table, in this case the tuple (1, 1, 1)
. This record is found, and so Postgres also updates the statistics
table.
However, when you execute this UPDATE
statement:
UPDATE city SET province_id = 7 WHERE id = 2
Postgres performs the UPDATE
on the record with the (id, country_id, province_id)
tuple having the values (2, 2, NULL)
. But Postgres cannot verify that a matching record exists in the statistics
table. The reason for this has to do with the meaning of NULL
. In relational algebra, NULL
means "not known." So Postgres sees a NULL
and gives up on the cascade because it cannot verify that the NULL
province_id
in the city
table in fact matches the NULL
value in the statistics
table.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With