
What is the best way to create a lock from a web application?

I've got a web application that re-sizes images. The re-sized images are written to disk in order to cache them. What is the best way to prevent multiple, simultaneous requests from generating the same image?

A couple of things to note: we have millions of images (measured in terabytes). Cached images that haven't been viewed in a while are removed. We have a web farm, but each web server has its own local cache (the originals are stored on another server). We also place the re-sized images in a second-tier cache once they are generated, so other web servers can check there to see whether the image is cached; if it is, it is copied to the local cache.

I've considered using locks (I posted a class that I'm considering using here). But that obviously won't work with the second-tier cache, and I'm not sure whether using locks on a web server is a good idea in general (though I'm not sure why; I've only seen a bunch of vague references to it being a bad idea).

I've also considered writing a temp file that I could check before I start creating the image, but I'm concerned that Windows won't clean up the file properly 100% of the time (locking issues, etc).

Any ideas are appreciated.

Brian asked Sep 13 '11 00:09



2 Answers

Have you considered using message-queue middleware for that, such as MSMQ or ActiveMQ? Once an image resize request is submitted to the web server, it goes onto the queue. A separate application checks the queue, resizes the image, and saves it to the cache.
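As a rough illustration of the queue idea, here is a minimal Python sketch. It uses an in-process `queue.Queue` purely as a stand-in for MSMQ or ActiveMQ, a dict as a stand-in for the cache, and hypothetical names (`run_resize_worker`, `fake_resize`, the `name@size` keys). Because a single worker drains the queue one item at a time, a duplicate request for the same image finds the result already cached and is not resized again.

```python
import queue
import threading


def run_resize_worker(jobs, cache, resize):
    """Process queued keys one at a time; resize each key at most once."""
    while True:
        key = jobs.get()
        if key is None:  # sentinel value: shut the worker down
            break
        if key not in cache:  # a duplicate request finds the result cached
            cache[key] = resize(key)


def demo():
    calls = []

    def fake_resize(key):  # hypothetical stand-in for real image resizing
        calls.append(key)
        return b"resized:" + key.encode()

    jobs, cache = queue.Queue(), {}
    worker = threading.Thread(
        target=run_resize_worker, args=(jobs, cache, fake_resize)
    )
    worker.start()
    # Two simultaneous requests for the same image, plus one other image.
    for key in ["cat.jpg@100x100", "cat.jpg@100x100", "dog.jpg@50x50"]:
        jobs.put(key)
    jobs.put(None)
    worker.join()
    return calls, cache
```

With a real message broker the worker would be a separate process or service, and the web request would need some way to wait for, or poll for, the finished image.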

Sergey Sirotkin answered Sep 30 '22 22:09



I would avoid locks if you can - especially since you don't need to lock here. You also want to avoid one machine locking based on another machine's processing. If two machines create the same resized image, I assume the results would be identical. So, if two machines happen to resize the same image because they both missed the cache, then it's only slightly less efficient (some wasted time) but very likely better than locking (and possibly deadlocking) and trying to optimize the edge case.
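One way to make that duplicate work harmless, sketched in Python under some assumptions: write the image to a uniquely named temp file in the cache directory and then move it into place with `os.replace`, so readers never see a half-written file. The write-then-rename step is my own addition rather than something described above, and the function names are hypothetical. Note that on Windows the replace may fail if another process has the destination file open, so real code would need to handle that case.

```python
import os
import tempfile


def write_cached(path, data):
    """Publish data at path without taking a lock.

    The bytes go to a uniquely named temp file in the same directory,
    then os.replace() moves it into place in a single step. If two
    requests race, the later one overwrites the file with identical content.
    """
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, path)
    except BaseException:
        # Clean up the temp file if the write or the replace failed.
        if os.path.exists(tmp):
            os.remove(tmp)
        raise


def get_or_create(path, make_bytes):
    """Return the cached bytes, generating and caching them on a miss."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read()
    data = make_bytes()  # may run on several requests at once; that is fine
    write_cached(path, data)
    return data
```

Here `make_bytes` stands for whatever produces the resized image; the worst case is that two requests both call it and one result replaces the other.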

One option would be to create the resized image locally and enqueue the cached item into a central queue (a database? in memory on a central service?), either with the data or with a reference to how to pull it from the front-end machine. The centralized cache queue is processed serially. If duplicates get put in the queue between the time the image is resized by more than one machine and the time the queue item gets processed, it doesn't matter, since processing the duplicate would simply skip pulling it because it's already on disk.
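A minimal Python sketch of that serial processing, assuming each queue item carries the image data directly (the "with the data" variant) and that the central cache is just a directory. The `process_cache_queue` name and the `(name, data)` item shape are hypothetical.

```python
import os
from collections import deque


def process_cache_queue(pending, central_dir):
    """Process (name, data) items one at a time; skip ones already stored."""
    stored, skipped = [], []
    while pending:
        name, data = pending.popleft()
        target = os.path.join(central_dir, name)
        if os.path.exists(target):
            # A duplicate from another front-end machine: nothing to do.
            skipped.append(name)
            continue
        with open(target, "wb") as f:
            f.write(data)
        stored.append(name)
    return stored, skipped
```

Because only one consumer touches the central cache, the existence check and the write cannot race with each other, which is what removes the need for a lock.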

bryanmac answered Sep 30 '22 23:09
