I need to load a large number of csv files in to a PostgreSQL database. I have a table source_files which contains the file paths and a flag which indicates whether a file has already been loaded for all the csv files I need to load.
I have written the following code which loads the first file correctly but then throws the error:
ERROR: cursor "curs" does not exist
Why am I getting this error and how can I fix it?
DO $$
DECLARE
file_record record;
curs CURSOR
FOR SELECT id, file_path
FROM source_files
WHERE added_to_db=FALSE
ORDER BY id;
BEGIN
OPEN curs;
LOOP
-- Get next entry in source file which contains name of csv to load
FETCH curs INTO file_record;
exit WHEN NOT found;
BEGIN
-- As we need to add a column to the data after loading csv but before inserting
-- into final table we use a temporary table mytemp
DROP TABLE mytemp;
CREATE TABLE mytemp
(
dataA numeric,
dataB numeric
);
-- Load csv file
EXECUTE FORMAT('COPY mytemp
FROM ''%s''
DELIMITER '',''
CSV HEADER;', file_record.file_path);
-- Add Column specifying what source file the data is from
ALTER TABLE mytemp
ADD COLUMN source_id int;
UPDATE mytemp
SET source_id=file_record.id;
-- Add the data to the destination table
INSERT INTO data_table(
dataA,
dataB,
source_id
)
SELECT
mytemp.dataA,
mytemp.dataB
mytemp.source_id
FROM
mytemp
-- Set a flag to indicate that the current file in source_files has been loaded
UPDATE source_files
SET added_to_db=TRUE WHERE CURRENT OF curs;
COMMIT;
END;
END LOOP;
CLOSE curs;
END $$;
There big problem with your code is the COMMIT. You can use COMMIT in a DO statement, but the cursor is closed as soon as the transaction ends.
In SQL you can create a cursor WITH HOLD that remains valid after the transaction has ended, but that is not available in PL/pgSQL.
I suggest removing the COMMIT.
Another error in your code is your use of the format function, which exposes you to SQL injection. Instead of
FORMAT('COPY mytemp
FROM ''%s''
DELIMITER '',''
CSV HEADER;', file_record.file_path);
use
FORMAT('COPY mytemp
FROM %L
DELIMITER '',''
CSV HEADER;', file_record.file_path);
You can make the code much simpler by using an implicit cursor in the loop:
FOR file_record IN
SELECT id, file_path
FROM source_files
WHERE added_to_db=FALSE
ORDER BY id
LOOP
...
END LOOP;
That saves you declaring the cursor and the EXIT WHEN. The OPEN and CLOSE statements are unnecessary anyway.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With