
Parallel POST requests using multiprocessing and requests in Python

I have a small code snippet as below:

import requests
import multiprocessing

header = {
'X-Location': 'UNKNOWN',
'X-AppVersion': '2.20.0',
'X-UniqueId': '2397123',
'X-User-Locale': 'en',
'X-Platform': 'Android',
'X-AppId': 'com.my_app',
'Accept-Language': 'en-ID',
'X-PushTokenType': 'GCM',
'X-DeviceToken': 'some_device_token'
}


BASE_URI = 'https://my_server.com/v2/customers/login'

def internet_resource_getter(post_data):
    stuff_got = []

    response = requests.post(BASE_URI, headers=header, json=post_data)
    stuff_got.append(response.json())

    return stuff_got

tokens = [{"my_token":'EAAOZAe8Q2rKYBAu0XETMiCZC0EYAddz4Muk6Luh300PGwGAMh26Bpw3AA6srcxbPWSTATpTLmvhzkUHuercNlZC1vDfL9Kmw3pyoQfpyP2t7NzPAOMCbmCAH6ftXe4bDc4dXgjizqnudfM0D346rrEQot5H0esW3RHGf8ZBRVfTtX8yR0NppfU5LfzNPqlAem9M5ZC8lbFlzKpZAZBOxsaz'},{"my_token":'EAAOZAe8Q2rKYBAKQetLqFwoTM2maZBOMUZA2w5mLmYQi1GpKFGZAxZCaRjv09IfAxxK1amZBE3ab25KzL4Bo9xvubiTkRriGhuivinYBkZAwQpnMZC99CR2FOqbNMmZBvLjZBW7xv6BwSTu3sledpLSGQvPIZBKmTv3930dBH8lazZCs3q0Q5i9CZC8mf8kYeamV9DED1nsg5PQZDZD'}]

pool = multiprocessing.Pool(processes=3)
pool_outputs = pool.map(internet_resource_getter, tokens)
pool.close()
pool.join()

All I am trying to do is fire parallel POST requests to the endpoint, where each POST has a different token as its request body.

  1. Will I be able to achieve what I want with the above? I get the output, but I am not certain whether my requests were actually sent in parallel.
  2. I am aware of grequests. I wanted truly parallel requests (as in utilizing the multiple processors on my system), and hence I chose multiprocessing over grequests (which, as far as I understand, uses gevent and is therefore concurrent but not truly parallel). Is my understanding correct here?
asked Apr 17 '17 by qre0ct

2 Answers

If you are interested in parallel execution of multiple POST requests, I suggest you use asyncio or aiohttp, both of which implement the idea of asynchronous tasks that run concurrently.

For example, you can do something like this with asyncio:

import requests
import asyncio

header = {
    'X-Location': 'UNKNOWN',
    'X-AppVersion': '2.20.0',
    'X-UniqueId': '2397123',
    'X-User-Locale': 'en',
    'X-Platform': 'Android',
    'X-AppId': 'com.my_app',
    'Accept-Language': 'en-ID',
    'X-PushTokenType': 'GCM',
    'X-DeviceToken': 'some_device_token'
}

BASE_URI = 'https://my_server.com/v2/customers/login'


def internet_resource_getter(post_data):
    stuff_got = []

    response = requests.post(BASE_URI, headers=header, json=post_data)

    stuff_got.append(response.json())
    print(stuff_got)
    return stuff_got

tokens = [
    {
        "my_token": 'EAAOZAe8Q2rKYBAu0XETMiCZC0EYAddz4Muk6Luh300PGwGAMh26B'
                    'pw3AA6srcxbPWSTATpTLmvhzkUHuercNlZC1vDfL9Kmw3pyoQfpyP'
                    '2t7NzPAOMCbmCAH6ftXe4bDc4dXgjizqnudfM0D346rrEQot5H0es'
                    'W3RHGf8ZBRVfTtX8yR0NppfU5LfzNPqlAem9M5ZC8lbFlzKpZAZBO'
                    'xsaz'
     },
    {
        "my_token": 'EAAOZAe8Q2rKYBAKQetLqFwoTM2maZBOMUZA2w5mLmYQi1GpKFGZAx'
                    'ZCaRjv09IfAxxK1amZBE3ab25KzL4Bo9xvubiTkRriGhuivinYBkZA'
                    'wQpnMZC99CR2FOqbNMmZBvLjZBW7xv6BwSTu3sledpLSGQvPIZBKmT'
                    'v3930dBH8lazZCs3q0Q5i9CZC8mf8kYeamV9DED1nsg5PQZDZD'
     }
]

loop = asyncio.get_event_loop()

futures = [
    loop.run_in_executor(None, internet_resource_getter, token)
    for token in tokens
]
# actually run the event loop and wait for all the scheduled calls to finish
loop.run_until_complete(asyncio.gather(*futures))

Note: asyncio only exists in Python 3.x. In my opinion this looks much more concise, and run_in_executor dispatches each blocking requests.post call to a thread pool, so the requests run concurrently (though, being thread-based, not across multiple cores).
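As a self-contained illustration of the run_in_executor pattern, here is a minimal sketch that uses a stand-in function in place of the real requests.post call (the endpoint in the question is hypothetical, so blocking_call and its return value are assumptions for demonstration):

```python
import asyncio

def blocking_call(token):
    # stand-in for the blocking requests.post call
    return {"token": token, "status": 200}

async def main(tokens):
    loop = asyncio.get_running_loop()
    # dispatch each blocking call to the default thread pool
    futures = [loop.run_in_executor(None, blocking_call, t) for t in tokens]
    # gather waits for all of them and preserves the input order
    return await asyncio.gather(*futures)

results = asyncio.run(main(["t1", "t2"]))
```

asyncio.run (Python 3.7+) takes care of creating and closing the event loop, which avoids the scheduling-without-running pitfall entirely.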

answered Sep 20 '22 by Yuval Pruss


1) Yes, the above code will make a request for each token. One way to check whether each request was handled correctly is to inspect its status code. Note that internet_resource_getter currently returns the parsed JSON, so for this check to work you would return the response object itself from the function instead:

for response in pool_outputs:
   if response.status_code != 200:
       raise Exception("{} - {}".format(response.status_code, response.text))

2) Yes, your understanding is sound. I too use a multiprocessing + requests combo instead of grequests.

Related:

When making requests in parallel, you generally don't need to focus on using multiple cores unless you are making millions of requests. This is because an HTTP request spends almost all of its time waiting on the network and very little on CPU processing. Your code will dispatch multiple requests at the same time, which is what really matters. Additionally, you may want to look into the Global Interpreter Lock to see if it affects your multi-core application: What is a global interpreter lock (GIL)?
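To see why threads are usually enough for I/O-bound requests, here is a sketch that simulates network latency with time.sleep standing in for requests.post. Because the sleeps (like socket waits) release the GIL, the three "requests" overlap, and the total time is close to one request's latency rather than three:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(token):
    # time.sleep releases the GIL, just like waiting on a socket does
    time.sleep(0.2)
    return {"token": token, "status": 200}

tokens = ["t1", "t2", "t3"]

start = time.monotonic()
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(fake_request, tokens))
elapsed = time.monotonic() - start
# elapsed is roughly 0.2 s, not 0.6 s, because the three waits overlap
```

The same pattern works with the real requests.post call, without the process-spawning overhead of multiprocessing.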

answered Sep 18 '22 by Matt Thomson