I have an asynchronous function to get data from the site:
async def get_matches_info(url):
    async with aiohttp.ClientSession() as session:
        try:
            async with session.get(url, proxy=proxy) as response:
                ...
                ...
                ...
                ...
        except:
            print('ERROR GET URL: ', url)
            traceback.print_exc()
I have a list of about 200 links. Almost always everything is OK, but sometimes I get the following error:
Traceback (most recent call last):
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 924, in _wrap_create_connection
    await self._loop.create_connection(*args, **kwargs))
  File "C:\Python37\lib\asyncio\base_events.py", line 986, in create_connection
    ssl_handshake_timeout=ssl_handshake_timeout)
  File "C:\Python37\lib\asyncio\base_events.py", line 1014, in _create_connection_transport
    await waiter
ConnectionResetError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "parser.py", line 90, in get_matches_info
    async with session.get(url, proxy=proxy) as response:
  File "C:\Python37\lib\site-packages\aiohttp\client.py", line 1005, in __aenter__
    self._resp = await self._coro
  File "C:\Python37\lib\site-packages\aiohttp\client.py", line 476, in _request
    timeout=real_timeout
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 522, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 851, in _create_connection
    req, traces, timeout)
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 1085, in _create_proxy_connection
    req=req)
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 931, in _wrap_create_connection
    raise client_error(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host www.myscore.com.ua:443 ssl:None [None]
I checked all the links from the errors - they are working. Why can this happen?
The server is probably limiting concurrent requests: a burst of simultaneous connections looks like a DoS attack, so it resets some of them. If you control the server and it runs Apache, you can raise the relevant limits in the httpd configuration, for example MaxRequestWorkers (simultaneous connections served) or MaxKeepAliveRequests (requests allowed per persistent connection). If not, you can cap the number of concurrent requests on the client side with an asyncio semaphore. The example below sets that limit to 100 concurrent requests.
async def get_matches_info(url, sem):
    # sem is a single asyncio.Semaphore(100), created once by the caller and shared by every call
    async with sem:
        async with aiohttp.ClientSession() as session:
            try:
                async with session.get(url, proxy=proxy) as response:
                    ...
Note that the semaphore has to be created once, outside the function, and passed in as a parameter. A semaphore created inside the function would be a new object on every call, so each request would acquire its own semaphore and nothing would actually be limited.
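To show how one shared semaphore caps concurrency across many calls, here is a minimal sketch. It makes no real requests: `fetch_limited`, `crawl`, the `stats` dictionary and the `example.invalid` URLs are hypothetical names for illustration, and `asyncio.sleep` stands in for the `session.get(...)` call from the question. The limit of 10 is arbitrary.

```python
import asyncio


async def fetch_limited(url, sem, stats):
    # Hypothetical stand-in for the real aiohttp request in get_matches_info.
    async with sem:  # waits here while `limit` other calls hold the semaphore
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        try:
            await asyncio.sleep(0.01)  # simulates network I/O
            return url
        finally:
            stats["active"] -= 1


async def crawl(urls, limit):
    # Create the semaphore once, inside the running event loop,
    # and hand the same object to every task.
    sem = asyncio.Semaphore(limit)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(fetch_limited(url, sem, stats) for url in urls)
    )
    return results, stats["peak"]


urls = [f"https://example.invalid/match/{i}" for i in range(200)]
results, peak = asyncio.run(crawl(urls, limit=10))
print(len(results), "requests finished; peak concurrency:", peak)
```

All 200 coroutines are scheduled at once, but the recorded peak never exceeds the limit, because every task has to acquire the same semaphore before it does any work.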