I'm using Celery to automate some screen scraping. I'm using Selenium to open up a Chrome webdriver, manipulate the page, save some data, and then move on to the next page in the queue. The problem is that it builds up and breaks down the web driver for every task in the queue, which is very time consuming and resource intensive.
How do I persist objects across calls? I've read some things about connection pooling in Celery, but it's not clear to me how exactly this works - where do I build up the webdriver - in the tasks file or in the main queueing file? If the latter, how do the workers know which webdriver to use?
Example:

scrape.py:

    for row in rows:
        scrape.delay(str(row['product_id']), str(row['pg_code']))

tasks.py:

    @app.task
    def scrape(product_id, pg_code):
        # do some stuff
The Celery worker then waits for tasks to arrive on the queue before executing them. This demonstrates how Celery uses Redis to distribute tasks across multiple workers and to manage the task queue.
Queues created by Celery are persistent by default. This means that the broker will write messages to disk to ensure that the tasks will be executed even if the broker is restarted.
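If that durability isn't needed (for example, if the scrape jobs are cheap to re-queue), the default can be overridden. A minimal sketch, assuming Celery's newer lowercase setting names (older versions spell it CELERY_DEFAULT_DELIVERY_MODE) and an illustrative RabbitMQ broker URL:

    from celery import Celery

    # Broker URL is illustrative; with an AMQP broker such as RabbitMQ the
    # delivery mode below controls whether messages are written to disk.
    app = Celery('tasks', broker='amqp://guest@localhost//')

    # 'persistent' (the default) survives a broker restart;
    # 'transient' keeps messages in memory only, trading durability for speed.
    app.conf.task_default_delivery_mode = 'transient'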
If you look at the Celery docs on tasks, you'll see that to call a task synchronously you use the apply() method as opposed to the apply_async() method. The docs also note that: "If the CELERY_ALWAYS_EAGER setting is set, it will be replaced by a local apply() call instead."
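For example, with the scrape task above (the argument values are made up for illustration):

    # apply_async() sends the task to the broker; a worker runs it later.
    scrape.apply_async(args=('12345', 'PG1'))   # same as scrape.delay('12345', 'PG1')

    # apply() runs the task synchronously in the current process and blocks.
    result = scrape.apply(args=('12345', 'PG1'))
    print(result.get())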
As for --concurrency: Celery by default uses multiprocessing to perform concurrent execution of tasks. The number of worker processes/threads can be changed using the --concurrency argument, and defaults to the number of available CPUs if not set.
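For example, assuming the tasks module above, a worker could be started with a fixed pool size instead of one process per CPU:

    # Four prefork (multiprocessing) child processes handle tasks in parallel.
    celery -A tasks worker --loglevel=info --concurrency=4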
Since each worker instantiates the task only once (it behaves like a singleton), you can cache the webdriver on the task object. The documentation specifically suggests this approach:
http://docs.celeryproject.org/en/latest/userguide/tasks.html#instantiation
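A minimal sketch of that pattern (the broker URL, the example.com URL, and the ScrapeTask name are assumptions for illustration; Chrome and chromedriver must be installed): the driver is created lazily, cached on the task instance, and reused for every task that the worker process executes.

    from celery import Celery, Task
    from selenium import webdriver

    app = Celery('tasks', broker='redis://localhost:6379/0')

    class ScrapeTask(Task):
        _driver = None

        @property
        def driver(self):
            # Built on first use, then cached for the lifetime of the
            # worker process instead of once per task.
            if self._driver is None:
                self._driver = webdriver.Chrome()
            return self._driver

    @app.task(base=ScrapeTask, bind=True)
    def scrape(self, product_id, pg_code):
        # Reuse the cached driver to load and scrape the page.
        self.driver.get('http://example.com/%s/%s' % (product_id, pg_code))
        # do some stuff

With the prefork pool, each child process keeps its own cached driver, so --concurrency also caps how many Chrome instances are open at once. The driver is never quit here, so in practice you would hook a worker shutdown signal (e.g. worker_process_shutdown) to call driver.quit().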