
Python async API requests in batches

I'm trying to make async API calls this way:

  1. A function to send the request:
async def get_data(client, postdata):
    res = await client.post(url=_url, headers=_headers, data=postdata)

    return res
  2. A function to parse the JSON:
async def parse_res(client, postdata):
    
    res = await get_data(client, postdata)
    
    if bool(json.loads(res.text)['suggestions']):
        _oks = <...grab some JSON fields...>
    else:
        _oks = {}
    return _oks
  3. I wrap these two functions in main():
async def main(_jobs):
    
    async with httpx.AsyncClient() as client:
        
        batch = []
        calls = []
    
        for job in _jobs:

            _postdata = '{ "query": "'+ job + '" }'

            calls.append(asyncio.create_task(parse_res(client, _postdata)))
            
        batch = await asyncio.gather(*calls)
            
        return batch

and then just run main().

But the API can handle only about 30-50 fast (nearly simultaneous) requests; beyond that it throws an HTTP 429 error.

So I need to send batches of 30 calls and process 10 000 requests in chunks.

How do I process 10 000 (ten thousand) API calls in batches of 30?

Ciro asked Oct 17 '25 11:10


1 Answer

You could use Simon Hawe's answer; however, here's a different approach that doesn't need any external libraries.

Use asyncio.Semaphore to limit the number of calls made concurrently. Each call acquires the semaphore before sending its request; when a call releases it, the next waiting call is allowed to run.
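
To see the limit in isolation first, here is a minimal sketch with no network involved. The names fake_request, run_demo and the state dict are made up for illustration, and asyncio.sleep stands in for the real client.post call. The limit of 30 is taken from the question.

```python
import asyncio

LIMIT = 30  # assumed cap, taken from the question's "batches of 30"


async def fake_request(sem, state, job):
    # Hypothetical stand-in for `await client.post(...)`.
    async with sem:  # at most LIMIT coroutines get past this line at once
        state["running"] += 1
        state["peak"] = max(state["peak"], state["running"])
        await asyncio.sleep(0.001)  # pretend the API takes a moment
        state["running"] -= 1
    return job


async def run_demo(n_jobs, limit=LIMIT):
    sem = asyncio.Semaphore(limit)
    state = {"running": 0, "peak": 0}
    # gather returns results in the same order as the jobs were passed in
    results = await asyncio.gather(
        *(fake_request(sem, state, job) for job in range(n_jobs))
    )
    return results, state["peak"]


results, peak = asyncio.run(run_demo(200))
print(len(results), "jobs done, peak concurrency:", peak)
```

All 200 tasks are created up front, but the recorded peak never goes above the limit. Applied to the functions from the question, the same pattern looks like this: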

import asyncio
import json

import httpx

sem = asyncio.Semaphore(30)  # max number of simultaneous requests

async def get_data(client, postdata):
    async with sem:  # wait here until one of the 30 slots is free
        res = await client.post(url=_url, headers=_headers, data=postdata)
    return res


async def parse_res(client, postdata):
    res = await get_data(client, postdata)
    if bool(json.loads(res.text)['suggestions']):
        _oks = <...grab some JSON fields...>
    else:
        _oks = {}
    return _oks


async def main(_jobs):
    async with httpx.AsyncClient() as client:
        calls = [
            # json.dumps builds the same payload and escapes quotes in job
            asyncio.create_task(parse_res(client, json.dumps({"query": job})))
            for job in _jobs
        ]

        return await asyncio.gather(*calls)
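
If you really want strict batches of 30, as the question is worded, a rough sketch could look like the one below. This is an alternative to the semaphore, not part of it. The helper names chunks, fake_call and run_in_batches are hypothetical, and fake_call stands in for parse_res(client, postdata) so the example runs without a network.

```python
import asyncio


def chunks(items, size):
    # Yield consecutive slices of at most `size` items.
    for start in range(0, len(items), size):
        yield items[start:start + size]


async def fake_call(job):
    # Hypothetical stand-in for parse_res(client, postdata).
    await asyncio.sleep(0)
    return {"query": job}


async def run_in_batches(jobs, size=30):
    results = []
    for batch in chunks(jobs, size):
        # Each batch has to finish completely before the next one starts.
        results.extend(await asyncio.gather(*(fake_call(j) for j in batch)))
    return results


jobs = [f"job-{i}" for i in range(100)]
batched = asyncio.run(run_in_batches(jobs))
print(len(batched), "results,", [len(c) for c in chunks(jobs, 30)])
```

The trade-off is that every batch waits for its slowest request, whereas the semaphore starts a new request as soon as any slot frees up.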
Łukasz Kwieciński answered Oct 20 '25 00:10


