REQUESTS: Return file object from url (as with open('','rb') )

Tags: python, file, download, python-requests, pypdf

I want to download a file straight into memory with requests and pass it directly to PyPDF2's PdfFileReader without writing it to disk, but I can't figure out how to pass it as a file object. Here's what I've tried:

import requests as req
from PyPDF2 import PdfFileReader

r_file = req.get('http://www.location.come/somefile.pdf')
rs_file = req.get('http://www.location.come/somefile.pdf', stream=True)

# Baseline: write the download to disk, then reopen it for reading.
with open('/location/somefile.pdf', 'wb') as f:
    for chunk in r_file.iter_content(chunk_size=8192):
        f.write(chunk)

local_file = open('/location/somefile.pdf', 'rb')

# Works:
pdf = PdfFileReader(local_file)

# As expected, these don't work:
pdf = PdfFileReader(rs_file)
pdf = PdfFileReader(r_file)
pdf = PdfFileReader(rs_file.content)
pdf = PdfFileReader(r_file.content)
pdf = PdfFileReader(rs_file.raw)
pdf = PdfFileReader(r_file.raw)

380

asked May 05 '15 09:05

TimY

1 Answer

Without needing to know anything about requests, you can always make a file-like object out of anything you already hold in memory as a string (or bytes) using StringIO or BytesIO.

In particular:

Python 2: StringIO.StringIO(s) is a binary file.
Python 2: cStringIO.StringIO(s) is the same, but possibly more efficient.
Python 3: io.BytesIO(b) is a binary file (constructed from bytes).
Python 3: io.StringIO(s) is a Unicode text file.
Python 2: io.BytesIO(s) is a binary file.
Python 2: io.StringIO(u) is a Unicode text file (constructed from unicode).

(The first two are "binary" in the Python 2 sense: no line-ending conversion. The others are "binary" vs. "text" in the Python 3 sense: bytes vs. Unicode.)
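To make the bytes-versus-text distinction concrete, here is a minimal Python 3 sketch. The sample values are made up for illustration and are not part of the question.

```python
import io

# Made-up sample data for illustration only.
binary_file = io.BytesIO(b"%PDF raw bytes")
text_file = io.StringIO("decoded text")

binary_chunk = binary_file.read(4)   # reading a BytesIO yields bytes
text_chunk = text_file.read(7)       # reading a StringIO yields str (Unicode)

# Python 3 refuses to build a text file from bytes.
try:
    io.StringIO(b"raw bytes")
    mixing_allowed = True
except TypeError:
    mixing_allowed = False
```

A downloaded PDF is binary data, so the BytesIO variant is the one that matches open(..., 'rb').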

So, io.BytesIO(response.content) gives you a valid binary file-like object in both Python 2 and Python 3. If you only care about Python 2, cStringIO.StringIO(response.content) may be more efficient.
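A minimal sketch of that wrapping step follows. To keep it runnable without a network connection, a placeholder byte string (fake_content, an assumption for this example) stands in for response.content; the commented lines at the end show how the same idea might look with the names from the question.

```python
import io

# Stand-in for response.content; a real download would supply these bytes.
fake_content = b"%PDF-1.4\n% placeholder body, not a valid PDF\n"

pdf_file = io.BytesIO(fake_content)

header = pdf_file.read(8)        # read just like a file opened with 'rb'
pdf_file.seek(0, io.SEEK_END)    # seeking works on an in-memory buffer
size = pdf_file.tell()
pdf_file.seek(0)                 # rewind before handing it to a reader

# With the question's names, the same idea would look like:
#   r_file = req.get('http://www.location.come/somefile.pdf')
#   pdf = PdfFileReader(io.BytesIO(r_file.content))
```

Seeking matters here because a PDF reader typically needs to jump around the file, which is likely why the non-seekable raw stream from the question did not work.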

Of course, "file-like" only goes so far: if the library tries to, e.g., call the fileno method and start making C calls against the file descriptor, it isn't going to work. But in the vast majority of cases, this works.
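The fileno limitation can be checked up front. The sketch below is one possible way to do that, not something the answer prescribes: has_file_descriptor and to_temp_file are hypothetical helper names, and the fallback does touch the filesystem, which gives up the "no disk" goal for libraries that insist on a real descriptor.

```python
import io
import tempfile

def has_file_descriptor(f):
    """Return True if f is backed by an OS-level file descriptor."""
    try:
        f.fileno()
    except (AttributeError, io.UnsupportedOperation):
        return False
    return True

def to_temp_file(payload):
    """Hypothetical fallback: copy bytes into an anonymous temporary file."""
    tmp = tempfile.TemporaryFile()   # opened in binary 'w+b' mode by default
    tmp.write(payload)
    tmp.seek(0)                      # rewind so the caller reads from the start
    return tmp

in_memory = io.BytesIO(b"example bytes")
in_memory_has_fd = has_file_descriptor(in_memory)

with to_temp_file(in_memory.getvalue()) as on_disk:
    on_disk_has_fd = has_file_descriptor(on_disk)
    round_trip = on_disk.read()
```

The temporary file is deleted automatically when it is closed, so nothing is left behind on disk after the with block exits.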

132

answered Sep 24 '22 13:09

abarnert
