
Tracking external files with SQL database and deleting an external file upon deletion of the record for it

I have no idea whether I'm going about this the right way or doing something completely stupid.

I have a file system that will hold a bunch of image files. These are large map images of variable size. I am using my database to do spatial queries on them.

Basically all I want to do is be able to add an image's information (name, directory, and spatial info) to the database and delete an image from the database (the records in all tables along with the external file associated with that record). I know how to delete all the records but not the external data. I do not want to insert the images into the database as a binary blob because I will frequently be using external tools on the files.

Basically my database is only tracking the file's name and directory along with the spatial data associated with the file.

How do I delete a file from the file system when I delete its record from the database?

Am I even going about this correctly? Is it more normal to insert the images into the database as binary blobs? (The overhead of copying the data around makes this implausible for me, and there must be a better way.)

I'm hoping it is irrelevant, but I'm using PostgreSQL as my SQL database under Linux.

EDIT: My current strategy is to use a shell script that handles the deletion of an image. The script builds a transaction file that drops all the database records associated with the image, and saves the file's full path to a flat text file. If the transaction succeeds, I then delete the image listed in the flat file. Is this sensible? Is there a better way?

SO Stinks asked Jun 13 '11 04:06



3 Answers

A lot depends on WHERE you want to put your images.

Since a database usually needs fast random IO, you'd put it on a box with a good battery-backed RAID 10 controller.

But a webserver serving zillions of mostly static (not often updated) files would need very different hardware, probably a RAID6, or a cloud of cheap servers.

So you have to take this into account in your design.

1) ON DELETE trigger

You could have the database delete the files via an ON DELETE trigger. Big problem: if the transaction is rolled back, the files stay deleted!

2) log table

An ON DELETE trigger inserts deleted image records in a log table. A cron job reads this and deletes the files later.

==> No ROLLBACK problems: if the transaction is rolled back, the log rows are rolled back with it.
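
A rough sketch of the log-table idea, using Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a server. The table names images and deleted_files, the trigger name, and the function purge_logged_files are all made up for illustration. In PostgreSQL the trigger would call a trigger function rather than use an inline body.

```python
import os
import sqlite3

# Hypothetical schema: "images" holds one row per file, "deleted_files" is the log.
SCHEMA = """
CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT NOT NULL UNIQUE);
CREATE TABLE deleted_files (id INTEGER PRIMARY KEY, path TEXT NOT NULL);
-- Log every deleted image row. The log row lives in the same transaction
-- as the DELETE, so a rollback discards it too.
CREATE TRIGGER log_image_delete AFTER DELETE ON images
BEGIN
    INSERT INTO deleted_files (path) VALUES (OLD.path);
END;
"""


def purge_logged_files(conn):
    """Cron-style worker: unlink each logged file, then clear its log row."""
    removed = []
    rows = conn.execute("SELECT id, path FROM deleted_files ORDER BY id").fetchall()
    for log_id, path in rows:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass  # already gone (e.g. an earlier run crashed before clearing the row)
        conn.execute("DELETE FROM deleted_files WHERE id = ?", (log_id,))
        conn.commit()
        removed.append(path)
    return removed
```

The worker removes the file before clearing the log row, so a crash in between only leaves a log row behind, and the next run tidies it up.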

3) garbage collection

A cron job compares the list of files on disk and the contents of the database, and deletes on disk files with no matching database record.

This is safe but likely much slower than the log table!
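
The garbage-collection pass could look roughly like the sketch below. The function names, the min_age_seconds grace period, and the dry_run flag are assumptions added for illustration; known_paths stands for whatever list of paths a query against the database returns.

```python
import os
import time


def find_orphans(image_root, known_paths, min_age_seconds=3600):
    """Return files under image_root that no database row refers to."""
    known = {os.path.abspath(p) for p in known_paths}
    cutoff = time.time() - min_age_seconds
    orphans = []
    for dirpath, _dirnames, filenames in os.walk(image_root):
        for name in filenames:
            path = os.path.abspath(os.path.join(dirpath, name))
            # Skip recent files: their INSERT may not have been committed yet.
            if path not in known and os.path.getmtime(path) <= cutoff:
                orphans.append(path)
    return sorted(orphans)


def collect_garbage(image_root, known_paths, min_age_seconds=3600, dry_run=True):
    """List orphaned files and, unless dry_run is set, delete them."""
    orphans = find_orphans(image_root, known_paths, min_age_seconds)
    if not dry_run:
        for path in orphans:
            os.remove(path)
    return orphans
```

Defaulting to a dry run is a deliberate choice here: a wrong image_root or an empty known_paths would otherwise wipe every file, so it seems worth looking at the list before deleting for real.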

4) do it in the application:

  • run DELETE ... RETURNING to get the list of deleted records, then COMMIT
  • delete those files from the filesystem

Failure points:

  • if your app dies between the COMMIT and the unlink(), you are left with files that have no database record. If you put the COMMIT after the unlink() you get the opposite (records pointing at missing files), which would be worse.
  • the same thing applies to INSERTs...
  • if something other than the application deletes from the database, it is not handled.
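
A minimal sketch of the application-side ordering (delete rows, COMMIT, then unlink), again with sqlite3 standing in for PostgreSQL. The images table, its columns, and delete_images are hypothetical. The SELECT followed by DELETE is only there to keep the stand-in portable; in PostgreSQL a single DELETE ... RETURNING path would do both.

```python
import os
import sqlite3


def delete_images(conn, image_ids):
    """Delete the rows, commit, and only then unlink the files.

    Returns (paths_deleted_from_db, paths_that_could_not_be_unlinked).
    """
    ids = list(image_ids)
    if not ids:
        return [], []
    marks = ",".join("?" * len(ids))
    try:
        paths = [row[0] for row in conn.execute(
            f"SELECT path FROM images WHERE id IN ({marks})", ids)]
        conn.execute(f"DELETE FROM images WHERE id IN ({marks})", ids)
        conn.commit()
    except Exception:
        conn.rollback()  # nothing on disk has been touched yet
        raise
    failed = []
    for path in paths:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass  # already gone, nothing left to do
        except OSError:
            failed.append(path)  # leave for a later cleanup pass
    return paths, failed
```

If the process dies after the commit, the worst case is a stray file with no record, which is the less harmful of the two failure modes described above.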

bobflux answered Oct 08 '22 04:10


Your "current strategy" sounds like the standard approach to me: delete from the database and if that succeeds (and that is a big "if"), delete the corresponding image file. You'd probably want a sanity checker to make sure you're not accumulating cruft though, just a simple comparison of the database and the file system to make sure they agree with each other.

You don't need to store the images in the database; the file system is quite good at handling files, and it will probably be a lot more convenient to have them there. And, as noted by David Ryder below, the file system will almost certainly be notably faster than the database for dealing with large image files: handling files is what file systems do.


UPDATE: If you need this to be really quick then you could try doing the file deletions with a cron job. Once every couple of hours (or day, or whatever works), a cron job could compare the database with the file system and delete any stray images. This would make mass deletes from the database easier: you'd be able to do a DELETE FROM whatever WHERE ... to kill off multiple entries, and then your janitor would come along later to clean up the leftover images.

mu is too short answered Oct 08 '22 02:10


There is a PHP function to delete files from the file system:

unlink(filename)

David Ryder answered Oct 08 '22 04:10