
Parallel POST requests using multiprocessing and requests in Python

I have a small code snippet as below:

import requests
import multiprocessing

header = {
'X-Location': 'UNKNOWN',
'X-AppVersion': '2.20.0',
'X-UniqueId': '2397123',
'X-User-Locale': 'en',
'X-Platform': 'Android',
'X-AppId': 'com.my_app',
'Accept-Language': 'en-ID',
'X-PushTokenType': 'GCM',
'X-DeviceToken': 'some_device_token'
}


BASE_URI = 'https://my_server.com/v2/customers/login'

def internet_resource_getter(post_data):
    stuff_got = []

    response = requests.post(BASE_URI, headers=header, json=post_data)
    stuff_got.append(response.json())

    return stuff_got

tokens = [{"my_token":'EAAOZAe8Q2rKYBAu0XETMiCZC0EYAddz4Muk6Luh300PGwGAMh26Bpw3AA6srcxbPWSTATpTLmvhzkUHuercNlZC1vDfL9Kmw3pyoQfpyP2t7NzPAOMCbmCAH6ftXe4bDc4dXgjizqnudfM0D346rrEQot5H0esW3RHGf8ZBRVfTtX8yR0NppfU5LfzNPqlAem9M5ZC8lbFlzKpZAZBOxsaz'},{"my_token":'EAAOZAe8Q2rKYBAKQetLqFwoTM2maZBOMUZA2w5mLmYQi1GpKFGZAxZCaRjv09IfAxxK1amZBE3ab25KzL4Bo9xvubiTkRriGhuivinYBkZAwQpnMZC99CR2FOqbNMmZBvLjZBW7xv6BwSTu3sledpLSGQvPIZBKmTv3930dBH8lazZCs3q0Q5i9CZC8mf8kYeamV9DED1nsg5PQZDZD'}]

pool = multiprocessing.Pool(processes=3)
pool_outputs = pool.map(internet_resource_getter, tokens)
pool.close()
pool.join()

All I am trying to do is fire parallel POST requests to the endpoint, where each POST has a different token as its request body.

  1. Will I be able to achieve what I want with the above? I get the output, but I am not certain whether my requests were actually sent in parallel.
  2. I am aware of grequests. I wanted truly parallel requests (as in utilizing the multiple processors on my system), and hence I chose multiprocessing over grequests (which, as far as I understand, uses gevent and is therefore concurrent but not truly parallel). Is my understanding correct here?
asked Apr 17 '17 by qre0ct

2 Answers

If you are interested in parallel execution of multiple POST requests, I suggest you use asyncio or aiohttp, both of which implement the idea of asynchronous tasks that run concurrently.

For example, you can do something like this with asyncio:

import requests
import asyncio

header = {
    'X-Location': 'UNKNOWN',
    'X-AppVersion': '2.20.0',
    'X-UniqueId': '2397123',
    'X-User-Locale': 'en',
    'X-Platform': 'Android',
    'X-AppId': 'com.my_app',
    'Accept-Language': 'en-ID',
    'X-PushTokenType': 'GCM',
    'X-DeviceToken': 'some_device_token'
}

BASE_URI = 'https://my_server.com/v2/customers/login'


def internet_resource_getter(post_data):
    stuff_got = []

    response = requests.post(BASE_URI, headers=header, json=post_data)

    stuff_got.append(response.json())
    print(stuff_got)
    return stuff_got

tokens = [
    {
        "my_token": 'EAAOZAe8Q2rKYBAu0XETMiCZC0EYAddz4Muk6Luh300PGwGAMh26B'
                    'pw3AA6srcxbPWSTATpTLmvhzkUHuercNlZC1vDfL9Kmw3pyoQfpyP'
                    '2t7NzPAOMCbmCAH6ftXe4bDc4dXgjizqnudfM0D346rrEQot5H0es'
                    'W3RHGf8ZBRVfTtX8yR0NppfU5LfzNPqlAem9M5ZC8lbFlzKpZAZBO'
                    'xsaz'
     },
    {
        "my_token": 'EAAOZAe8Q2rKYBAKQetLqFwoTM2maZBOMUZA2w5mLmYQi1GpKFGZAx'
                    'ZCaRjv09IfAxxK1amZBE3ab25KzL4Bo9xvubiTkRriGhuivinYBkZA'
                    'wQpnMZC99CR2FOqbNMmZBvLjZBW7xv6BwSTu3sledpLSGQvPIZBKmT'
                    'v3930dBH8lazZCs3q0Q5i9CZC8mf8kYeamV9DED1nsg5PQZDZD'
     }
]

loop = asyncio.get_event_loop()

futures = [
    loop.run_in_executor(None, internet_resource_getter, token)
    for token in tokens
]
# actually run the event loop and wait for all the scheduled calls to finish
loop.run_until_complete(asyncio.gather(*futures))

Note: asyncio only exists in Python 3.x. In my opinion this looks much more concise, and run_in_executor dispatches each blocking requests.post call to a thread pool, so the requests run concurrently (though, being thread-based, not across multiple cores).
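As a self-contained illustration of the run_in_executor pattern, here is a minimal sketch that uses a stand-in function in place of the real requests.post call (the endpoint in the question is hypothetical, so blocking_call and its return value are assumptions for demonstration):

```python
import asyncio

def blocking_call(token):
    # stand-in for the blocking requests.post call
    return {"token": token, "status": 200}

async def main(tokens):
    loop = asyncio.get_running_loop()
    # dispatch each blocking call to the default thread pool
    futures = [loop.run_in_executor(None, blocking_call, t) for t in tokens]
    # gather waits for all of them and preserves the input order
    return await asyncio.gather(*futures)

results = asyncio.run(main(["t1", "t2"]))
```

asyncio.run (Python 3.7+) takes care of creating and closing the event loop, which avoids the scheduling-without-running pitfall entirely.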

answered Sep 20 '22 by Yuval Pruss


1) Yes, the above code will make a request for each token. One way to check whether each request was handled correctly is to inspect its status code. Note that internet_resource_getter currently returns the parsed JSON, so for this check to work you would return the response object itself from the function instead:

for response in pool_outputs:
   if response.status_code != 200:
       raise Exception("{} - {}".format(response.status_code, response.text))

2) Yes, your understanding is sound. I too use a multiprocessing + requests combo instead of grequests.

Related:

When making requests in parallel, you generally don't need to focus on using multiple cores unless you are making millions of requests. This is because an HTTP request spends almost all of its time waiting on the network and very little on CPU processing. Your code will dispatch multiple requests at the same time, which is what really matters. Additionally, you may want to look into the Global Interpreter Lock to see if it affects your multi-core application: What is a global interpreter lock (GIL)?
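To see why threads are usually enough for I/O-bound requests, here is a sketch that simulates network latency with time.sleep standing in for requests.post. Because the sleeps (like socket waits) release the GIL, the three "requests" overlap, and the total time is close to one request's latency rather than three:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(token):
    # time.sleep releases the GIL, just like waiting on a socket does
    time.sleep(0.2)
    return {"token": token, "status": 200}

tokens = ["t1", "t2", "t3"]

start = time.monotonic()
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(fake_request, tokens))
elapsed = time.monotonic() - start
# elapsed is roughly 0.2 s, not 0.6 s, because the three waits overlap
```

The same pattern works with the real requests.post call, without the process-spawning overhead of multiprocessing.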

answered Sep 18 '22 by Matt Thomson