Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

GAE python threads not executing in parallel

I am trying to create a simple web app using Python on GAE. The app needs to spawn some threads per request received. For this I am using python's threading library. I spawn all the threads and then wait on them.

t1.start()
t2.start()
t3.start()

t1.join()
t2.join()
t3.join()

The application runs fine except for the fact that the threads are running serially rather than concurrently(confirmed this by printing the timestamps at the beginning/end of each thread's run() method). I have followed the instructions given in http://code.google.com/appengine/docs/python/python27/using27.html#Multithreading to enable multithreading

My app.yaml looks like:

application: myapp
version: 1
runtime: python27
api_version: 1
threadsafe: true

handlers:
- url: /favicon\.ico
  static_files: favicon.ico
  upload: favicon\.ico

- url: /stylesheet
  static_dir: stylesheet

- url: /javascript
  static_dir: javascript

- url: /pages
  static_dir: pages

- url: .*
  script: main.app

I made sure that my local GoogleAppLauncher uses python 2.7 by setting the path explicitly in the preferences.

My threads have an average run-time of 2-3 seconds in which they make a url open call and do some processing on the result.

Am I doing something wrong, or missing some configuration to enable multithreading?

like image 712
Nitesh Avatar asked Feb 19 '12 18:02

Nitesh


People also ask

Can Python run threads in parallel?

In fact, a Python process cannot run threads in parallel but it can run them concurrently through context switching during I/O bound operations. This limitation is actually enforced by GIL. The Python Global Interpreter Lock (GIL) prevents threads within the same process to be executed at the same time.

How many threads can run in parallel in Python?

Each CPU core can have up to two threads if your CPU has multi/hyper-threading enabled. You can search for your own CPU processor to find out more. For Mac users, you can find out from About > System Report. This means that my 6-Core i7 processor has 6 cores and can have up to 12 threads.

What is threading PY?

Threading in python is used to run multiple threads (tasks, function calls) at the same time. Note that this does not mean that they are executed on different CPUs. Python threads will NOT make your program faster if it already uses 100 % CPU time.


2 Answers

Are you experiencing this in the dev_appserver or after uploading your app to the production service? From your mention of GoogleAppLauncher it sounds like you may be seeing this in the dev_appserver; the dev_appserver does not emulate the threading behavior of the production servers, and you'd be surprised to find that it works just fine after you deploy your app. (If not, add a comment here.)

Another idea: if you are mostly waiting for the urlfetch, you can run many urlfetch calls in parallel by using the async interface to urlfetch: http://code.google.com/appengine/docs/python/urlfetch/asynchronousrequests.html

This approach does not require threads. (It still doesn't properly parallelize the requests in the dev_appserver; but it does do things properly on the production servers.)

like image 152
Guido van Rossum Avatar answered Nov 06 '22 01:11

Guido van Rossum


The multithreading notes for GAE are merely for how requests are handled - they don't fundamentally change how Python threads work. Specifically, the "CPython Implementation Detail" note in the threading module docs still applies.

It's also worth mentioning the note in the "Sandboxing" section of the GAE docs:

Note that threads will be joined by the runtime when the request ends, so the threads cannot run past the end of the request.

like image 36
Nick Bastin Avatar answered Nov 06 '22 01:11

Nick Bastin