I'm trying to insert some files into a Postgres database. Since lots of duplication is expected, we put the files themselves into the file table, then link them to the section of the database we're using with the output_file table. As the file table is also referenced by tables other than output_file (for example, the similar input_file table), one of its columns is a reference count, which is updated by a trigger when rows are inserted into output_file (and the other tables, too, although they aren't being used at the times the problem occurs).
CREATE TABLE file
(
    file_id serial PRIMARY KEY,
    --other columns
    occurences integer NOT NULL DEFAULT 0
);

CREATE TABLE output_file
(
    output_file_id serial PRIMARY KEY,
    --other columns
    file_id integer REFERENCES file NOT NULL
);

CREATE OR REPLACE FUNCTION file_insert() RETURNS opaque AS '
    BEGIN
        UPDATE file
        SET occurences = occurences + 1
        WHERE file.file_id = NEW.file_id;
        RETURN NEW;
    END;
' LANGUAGE plpgsql;

CREATE TRIGGER output_file_insert AFTER INSERT
    ON output_file FOR EACH ROW
    EXECUTE PROCEDURE file_insert();
The code that inserts the files is shown below, and is all one transaction.
private void insertFiles(Set<File> files){
    SortedSet<Integer> outputFileIDs = new TreeSet<Integer>();
    PreparedStatement fileExistsStatement = getFileExistsStatement();
    for(File file : files) {
        try {
            int fileID = -1;
            ResultSet rs = /* Query to see if file already present in file table */
            if(rs.next()) {
                // File found
                fileID = rs.getInt(1);
            }
            if(fileID < 0) {
                /* File does not exist, upload it */
                rs = /* Query to get file ID */
                fileID = rs.getInt(1);
            }
            outputFileIDs.add(fileID);
        }
        catch(FileNotFoundException e){
            /* handle errors */
        }
    }
    Iterator<Integer> it = outputFileIDs.iterator();
    while(it.hasNext()){
        /* Insert reference in output file table */
        PreparedStatement outputFileStatement = "INSERT INTO output_file (file_id, /*...*/) VALUES (?, /*...*/);";
        outputFileStatement.setInt(1, it.next());
        outputFileStatement.executeUpdate();
    }
}
My problem is that this code deadlocks (exception shown below) a lot. It'll chunter along fairly happily for a while, then deadlocks will start happening all over the place, to the extent that nothing makes it into the database at all when we roll back and retry. I'm mystified as to why it's deadlocking in the first place, though. The file IDs are stored in a sorted set, so the locks on the file table should be acquired in a consistent order for all transactions, as suggested in the Postgres manual, preventing any deadlock. What am I doing wrong? Does Postgres run its triggers in an undefined order?
org.postgresql.util.PSQLException: ERROR: deadlock detected
Detail: Process 8949 waits for ShareLock on transaction 256629; blocked by process 8924.
Process 8924 waits for ExclusiveLock on tuple (4148,40) of relation 30265 of database 16384; blocked by process 8949.
Hint: See server log for query details.
Where: SQL statement "UPDATE file
SET occurences = occurences + 1
WHERE file.file_id = NEW.file_id"
PL/pgSQL function "file_insert" line 2 at SQL statement
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeUpdate(NewProxyPreparedStatement.java:105)
at [outputFileStatement.executeUpdate(), above]
[edit] As requested by axtavt, transactions are managed by the code which calls the method shown.
public void run(){
    /* connection.setAutoCommit(false) has already been called elsewhere */
    try{
        boolean committed = false;
        boolean deadlocked = false;
        synchronized(connection){
            do{
                deadlocked = false;
                try {
                    /* insert lots of other stuff */
                    if(!files.isEmpty()){
                        insertFiles(files);
                    }
                    /* insert some more stuff */
                    connection.commit();
                    committed = true;
                    closeStatements();
                }
                catch(PSQLException e){
                    if(e.getSQLState() != null){
                        if(e.getSQLState().equals("40P01")){
                            /* Log the fact that we're deadlocked */
                            deadlocked = true;
                        }
                        else{
                            throw e;
                        }
                    }
                    else{
                        throw e;
                    }
                }
                finally {
                    try {
                        if(!committed) {
                            connection.rollback();
                        }
                    }
                    catch (SQLException e) {
                        /* Log exceptions */
                    }
                }
            }while(deadlocked);
        }
    }
    catch(Exception e){
        /* Log exceptions */
    }
    finally{
        try {
            connection.close();
        }
        catch (SQLException e) {
            /* Log exceptions */
        }
    }
}
Here's what is going on (I suspect):
A file tuple (let's say file.file_id = 1) already exists.
Process A inserts a new output_file with output_file.file_id=1. This acquires a sharelock on the file tuple with file.file_id = 1.
Process B inserts a new output_file with output_file.file_id=1. This also acquires a sharelock on the file tuple with file.file_id = 1 (more than one transaction can hold a sharelock on a single tuple).
Process A then runs the trigger, which attempts to update the file tuple with file.file_id=1. It can't promote its sharelock to an exclusive lock, since another transaction (process B) is holding a sharelock. So, it waits.
Process B then runs its trigger, which attempts to update the file tuple with file.file_id=1. It can't promote its sharelock, for the same reason, so it waits... and then, we have a deadlock.
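The sequence above can be sketched as a toy model in Java. This is only an illustration of the lock compatibility rules being described, not how Postgres implements row locks; the class RowLockModel and its methods are hypothetical names made up for this sketch.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the lock state of a single row (hypothetical, for illustration only).
class RowLockModel {
    private final Set<String> shareHolders = new HashSet<>();
    private String exclusiveHolder = null;

    // A share lock is granted unless another transaction holds the exclusive lock.
    boolean tryShare(String tx) {
        if (exclusiveHolder != null && !exclusiveHolder.equals(tx)) {
            return false;
        }
        shareHolders.add(tx);
        return true;
    }

    // The exclusive lock is granted only if no other transaction holds any lock on the row.
    boolean tryExclusive(String tx) {
        if (exclusiveHolder != null && !exclusiveHolder.equals(tx)) {
            return false;
        }
        for (String holder : shareHolders) {
            if (!holder.equals(tx)) {
                return false;
            }
        }
        exclusiveHolder = tx;
        return true;
    }

    // Replays the steps: both inserts take a share lock, then both triggers try to upgrade.
    static boolean bothStuckWithShareFirst() {
        RowLockModel row = new RowLockModel();
        row.tryShare("A");                          // A's INSERT takes a share lock on the file row
        row.tryShare("B");                          // B's INSERT does the same; share locks coexist
        boolean aBlocked = !row.tryExclusive("A");  // A's trigger UPDATE must wait for B
        boolean bBlocked = !row.tryExclusive("B");  // B's trigger UPDATE must wait for A
        return aBlocked && bBlocked;                // each waits on the other: deadlock
    }
}
```

In this model neither transaction can ever be granted the exclusive lock, because each is blocked by a share lock that the other will only release when it finishes.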
To fix this, before you insert the new tuple, do a SELECT ... FOR UPDATE on the file tuple that will be the new parent; that takes the exclusive lock on that row up front, so the other transactions will queue up right from the start.
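As a rough sketch of how that might look in the JDBC code from the question: the class name LockingOutputFileInserter and the method insertOutputFiles are hypothetical, the other output_file columns are left out because the question elides them, and the caller is assumed to have autocommit switched off so that the row lock is held until commit.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.SortedSet;

// Hypothetical helper: locks each parent file row before inserting the child row.
class LockingOutputFileInserter {
    // Takes the exclusive row lock before the insert and the trigger touch the row.
    static final String LOCK_FILE_SQL =
        "SELECT file_id FROM file WHERE file_id = ? FOR UPDATE";

    // Only file_id is set here; the real table has further columns elided in the question.
    static final String INSERT_OUTPUT_FILE_SQL =
        "INSERT INTO output_file (file_id) VALUES (?)";

    // Must run inside a transaction (autocommit off), otherwise the lock
    // would be released as soon as the SELECT statement completes.
    static void insertOutputFiles(Connection connection, SortedSet<Integer> fileIds)
            throws SQLException {
        try (PreparedStatement lock = connection.prepareStatement(LOCK_FILE_SQL);
             PreparedStatement insert = connection.prepareStatement(INSERT_OUTPUT_FILE_SQL)) {
            // Iterating the sorted set keeps the lock order consistent across transactions.
            for (int fileId : fileIds) {
                lock.setInt(1, fileId);
                try (ResultSet rs = lock.executeQuery()) {
                    if (!rs.next()) {
                        throw new SQLException("No file row with file_id " + fileId);
                    }
                }
                insert.setInt(1, fileId);
                insert.executeUpdate();
            }
        }
    }
}
```

Because the IDs are still visited in sorted order, every transaction requests the exclusive locks in the same sequence, which is the ordering guarantee the question was relying on.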