I'm trying to stream a response into csv.reader using requests.get(url, stream=True) to handle quite big data feeds. My code worked fine with Python 2.7. Here's the code:
response = requests.get(url, stream=True)
ret = csv.reader(response.iter_lines(decode_unicode=True),
                 delimiter=delimiter, quotechar=quotechar,
                 dialect=csv.excel_tab)
for line in ret:
    line.get('name')
Unfortunately, after migrating to Python 3.6, I got the following error:
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
I was trying to find some wrapper/decorator that would convert the result of the response.iter_lines() iterator from bytes to strings, but no luck with that. I already tried the io package and also codecs. Using codecs.iterdecode doesn't split the data into lines; it probably splits by chunk_size instead, and in that case csv.reader complains as follows:
_csv.Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?
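For illustration, here's a minimal sketch of that chunk-splitting behaviour, using a hand-made list of byte chunks in place of a real streamed response:

```python
import codecs
import csv

# Hand-made byte chunks standing in for a streamed response body;
# note the chunk boundaries do not line up with line boundaries.
chunks = [b'a,b\nc,', b'd\ne,f\n']

# codecs.iterdecode yields one decoded string per input chunk,
# not per line, so embedded newlines survive inside each string.
decoded = list(codecs.iterdecode(iter(chunks), 'utf-8'))
print(decoded)  # ['a,b\nc,', 'd\ne,f\n']

# Feeding chunk-sized strings to csv.reader triggers the error above.
try:
    list(csv.reader(decoded))
except csv.Error as exc:
    print(exc)
```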
You could wrap this in a generator expression and feed the decoded lines to it:
from contextlib import closing

with closing(requests.get(url, stream=True)) as r:
    f = (line.decode('utf-8') for line in r.iter_lines())
    reader = csv.reader(f, delimiter=',', quotechar='"')
    for row in reader:
        print(row)
Using some sample data in 3.5, this satisfies csv.reader: every line fed to it is first decoded in the generator expression. I'm also using closing from contextlib, as is generally suggested, to automatically close the response.