I'm working with a process which is basically as follows: take a list of URLs, get a Response object from each, and then work with the .text of each Response.
From my understanding, this seems ideal for grequests:
GRequests allows you to use Requests with Gevent to make asynchronous HTTP Requests easily.
Yet the two processes (one with requests, one with grequests) give me different results: some of the requests made with grequests return None rather than a response.
import requests
tickers = [
'A', 'AAL', 'AAP', 'AAPL', 'ABBV', 'ABC', 'ABT', 'ACN', 'ADBE', 'ADI',
'ADM', 'ADP', 'ADS', 'ADSK', 'AEE', 'AEP', 'AES', 'AET', 'AFL', 'AGN',
'AIG', 'AIV', 'AIZ', 'AJG', 'AKAM', 'ALB', 'ALGN', 'ALK', 'ALL', 'ALLE',
]
BASE = 'https://finance.google.com/finance?q={}'
rs = (requests.get(u) for u in [BASE.format(t) for t in tickers])
rs = list(rs)
rs
# [<Response [200]>,
# <Response [200]>,
# <Response [200]>,
# <Response [200]>,
# <Response [200]>,
# <Response [200]>,
# ...
# <Response [200]>]
# All are okay (status_code == 200)
# Restarted my interpreter and redefined `tickers` and `BASE`
import grequests
rs = (grequests.get(u) for u in [BASE.format(t) for t in tickers])
rs = grequests.map(rs)
rs
# [None,
# <Response [200]>,
# None,
# None,
# None,
# None,
# None,
# None,
# None,
# None,
# None,
# None,
# None,
# None,
# None,
# None,
# None,
# None,
# <Response [200]>,
# <Response [200]>,
# <Response [200]>,
# <Response [200]>,
# <Response [200]>,
# <Response [200]>,
# <Response [200]>,
# <Response [200]>,
# <Response [200]>,
# <Response [200]>,
# <Response [200]>,
# <Response [200]>]
Why the difference in results?
Update: I can print the exception type as follows. Related discussion here but I have no idea what's going on.
def exception_handler(request, exception):
    print(exception)
rs = grequests.map(rs, exception_handler=exception_handler)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
# ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
You are sending requests too fast. Because grequests is an async library, all of these requests are sent almost simultaneously, and that is too many at once.
Limit the number of concurrent requests with grequests.map(rs, size=your_choice). I have tested grequests.map(rs, size=10) and it works well.
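To make "limiting the concurrent tasks" concrete, here is a minimal sketch that uses only the standard library rather than grequests. It is an illustration of the idea, not how grequests is implemented: `run_bounded` and `fake_fetch` are hypothetical names, `fake_fetch` just sleeps instead of making an HTTP request, and `max_workers` plays roughly the role that `size` plays in grequests.map.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_bounded(items, limit):
    """Process items with at most `limit` calls in flight.

    Returns (results, peak), where peak is the highest number of
    calls that were observed running at the same time.
    """
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def fake_fetch(item):
        # Hypothetical stand-in for an HTTP request: it only sleeps briefly.
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.01)
        with lock:
            state["active"] -= 1
        return item.lower()

    # The pool never runs more than `limit` calls concurrently.
    with ThreadPoolExecutor(max_workers=limit) as pool:
        results = list(pool.map(fake_fetch, items))  # keeps input order
    return results, state["peak"]


sample = ['A', 'AAL', 'AAP', 'AAPL', 'ABBV', 'ABC']
results, peak = run_bounded(sample, limit=2)
print(results, peak)
```

With a small limit the work takes longer, but the remote server never sees more than `limit` connections from you at once, which is the trade-off you are making when you pick `size`.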
I do not know the exact reason for the behavior observed with .map(). However, using the .imap() function with size=1 always returned a 'Response 200' during my few minutes of testing. Here is the code snippet:
rs = (grequests.get(u) for u in [BASE.format(t) for t in tickers])
rsm_iterator = grequests.imap(rs, exception_handler=exception_handler, size=1)
rsm_list = [r for r in rsm_iterator]
print(rsm_list)
If you don't want to wait for all requests to finish before working on their responses, you can iterate over the results directly:
rs = (grequests.get(u) for u in [BASE.format(t) for t in tickers])
rsm_iterator = grequests.imap(rs, exception_handler=exception_handler, size=1)
for r in rsm_iterator:
    print(r)
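Whichever size you settle on, it can help to check which URLs still failed. The following is a rough sketch, assuming the list returned by grequests.map lines up index-for-index with the URLs you passed in, with None marking a failed request (as the output in the question suggests). `split_results` is a hypothetical helper, and the plain `object()` values stand in for real Response objects so the example runs without network access.

```python
def split_results(urls, responses):
    """Pair each URL with its result and separate successes from failures.

    `responses` is expected to be the same length as `urls`, with None
    in the positions where the request did not produce a response.
    """
    ok, failed = [], []
    for url, resp in zip(urls, responses):
        if resp is None:
            failed.append(url)        # candidate for a later retry
        else:
            ok.append((url, resp))
    return ok, failed


BASE = 'https://finance.google.com/finance?q={}'
urls = [BASE.format(t) for t in ['A', 'AAL', 'AAP']]

# Stand-ins for what a map call might return: one success, two failures.
fake_responses = [None, object(), None]

ok, failed = split_results(urls, fake_responses)
print(len(ok), failed)
```

The URLs in `failed` can then be sent again, for example with a smaller size.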