Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pointers on using celery with sorl-thumbnails with remote storages?

I'm surprised I don't see anything but "use celery" when searching for how to use celery tasks with sorl-thumbnails and S3.

The problem: using remote storages causes massive delays when generating thumbnails (think 100s+ for a page with many thumbnails) while the thumbnail engine downloads originals from remote storage, crunches them, then uploads back to s3.

Where is a good place to set up the celery task within sorl, and what should I call?

Any of your experiences / ideas would be greatly appreciated.

I will start digging around Sorl internals to find a more useful place to delay this task, but there are a few more things I'm curious about if this has been solved before.

  1. What image is returned immediately? Sorl must be told somehow that the image returned is not the real thumbnail. The cache must be invalidated when celery finishes the task.

  2. Handle multiple thumbnail generation requests cleanly (only need the first one for a given cache key)

For now, I've temporarily solved this by using an nginx reverse proxy cache that can serve hits while the backend spends time generating expensive pages (resizing huge PNGs on a huge product grid) but it's a very manual process.

like image 919
Yuji 'Tomita' Tomita Avatar asked May 03 '12 02:05

Yuji 'Tomita' Tomita


2 Answers

I think what you want to do is set THUMBNAIL_BACKEND to a custom class that overrides the _create_thumbnail method. Instead of generating the thumbnail in that function, kick of a celery task that calls _create_thumbnail with the same arguments as given to the function. The thumbnail won't be available during the request, but it will get generated in the background.

like image 60
jterrace Avatar answered Nov 17 '22 22:11

jterrace


As I understand Sorl works correctly with the S3 storage but it's very slow.

I believe that you know what image sizes do you need.

You should launch the celery task after the image was uploaded. In task you call to sorl.thumbnail.default.backend.get_thumbnail(file, geometry_string, **options)

Sorl will generate a thumbnail and upload it to S3. Next time you request an image from template it's already cached and served directly from Amazon's servers

a clean way to handle a placeholder thumbnail image while the image is being processed.

For this you will need to override the Sorl backend. Add new argument to get_thumbnail function, e.g. generate=False. When you will call this function from celery pass generate=True

And in function change it's logic, so if thumb is not present and generate is True you work just like the standard backend, but if generate is false you return your placeholder image with text like "We process your image now, come back later" and do not call backend._create_thumbnail. You can launch a task in this case, if you think that thumbnail can be accidentally deleted.

I hope this helps

like image 4
Igor Avatar answered Nov 18 '22 00:11

Igor