I want to download a file straight into memory using requests in order to pass it directly to the PyPDF2 reader, avoiding writing it to disk, but I can't figure out how to pass it as a file object. Here's what I've tried:
import requests as req
from PyPDF2 import PdfFileReader
r_file = req.get('http://www.location.come/somefile.pdf')
rs_file = req.get('http://www.location.come/somefile.pdf', stream=True)
with open('/location/somefile.pdf', 'wb') as f:
    for chunk in r_file.iter_content():
        f.write(chunk)
local_file = open('/location/somefile.pdf', 'rb')
#Works:
pdf = PdfFileReader(local_file)
#As expected, these don't work:
pdf = PdfFileReader(rs_file)
pdf = PdfFileReader(r_file)
pdf = PdfFileReader(rs_file.content)
pdf = PdfFileReader(r_file.content)
pdf = PdfFileReader(rs_file.raw)
pdf = PdfFileReader(r_file.raw)
Go to that directory and start a temporary HTTP server from the terminal (or cmd, depending on your OS) with the command python -m http.server 8000 (8000 is the port number). Open the desired file in your browser and copy its link to use as your URL.
Writing response to file
When writing a response to a file, you need to use the open function with the appropriate write mode. For text responses, use "w" (plain write mode). For binary responses, use "wb" (binary write mode).
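A minimal sketch of the two write modes, with no network access involved. The variables text_body and binary_body are stand-ins (assumptions) for what response.text and response.content would hold, and the file names are hypothetical:

```python
import os
import tempfile

# Stand-ins for response.text and response.content (assumed values).
text_body = "hello, world\n"
binary_body = b"%PDF-1.4 example"

# Write into a fresh temporary directory so nothing existing is overwritten.
out_dir = tempfile.mkdtemp()
text_path = os.path.join(out_dir, "page.txt")    # hypothetical file name
binary_path = os.path.join(out_dir, "file.pdf")  # hypothetical file name

# Text mode: takes str and encodes it on the way out.
with open(text_path, "w", encoding="utf-8") as f:
    f.write(text_body)

# Binary mode: takes bytes and writes them unchanged.
with open(binary_path, "wb") as f:
    f.write(binary_body)

# Read the binary file back to confirm the bytes survived untouched.
with open(binary_path, "rb") as f:
    round_trip = f.read()
```

Writing bytes to a file opened with "w", or str to one opened with "wb", raises a TypeError in Python 3, so the mode has to match the kind of response body you have.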
To post HTML form data to the server in URL-encoded format using Python, you need to make an HTTP POST request to the server and provide the HTML form data in the body of the Python POST message. You also need to specify the data type using the Content-Type: application/x-www-form-urlencoded request header.
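As a rough illustration, here is how such a request could be built with the standard library alone. The URL and the form fields are made up for the example, and the request is only constructed, never sent:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form fields and endpoint, for illustration only.
form_fields = {"name": "Ada", "lang": "python"}
url = "http://example.com/form"

# URL-encode the fields and turn them into bytes for the request body.
body = urlencode(form_fields).encode("ascii")

# Supplying a body makes this a POST; the header declares the encoding.
request = Request(
    url,
    data=body,
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)
```

With the requests library, passing a dict as the data argument of requests.post produces the same kind of body and sets this header for you.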
Without having to know anything about requests, you can always make a file-like object out of anything you have in memory as a string using StringIO.
In particular:
StringIO.StringIO(s) is a binary file.
cStringIO.StringIO(s) is the same, but possibly more efficient.
io.BytesIO(b) is a binary file (constructed from bytes).
io.StringIO(s) is a Unicode text file.
io.BytesIO(s) is a binary file.
io.StringIO(u) is a Unicode text file (constructed from unicode).
(The first two are "binary" in the Python 2 sense: no line-ending conversion. The others are "binary" vs. "text" in the Python 3 sense: bytes vs. Unicode.)
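To make the bytes vs. Unicode distinction concrete, here is a small Python 3 sketch. The sample strings and the rejects helper are only for illustration:

```python
import io

# Binary file-like object built from bytes; read() returns bytes.
binary_file = io.BytesIO(b"caf\xc3\xa9")
binary_data = binary_file.read()

# Text file-like object built from str; read() returns str.
text_file = io.StringIO("caf\u00e9")
text_data = text_file.read()


def rejects(factory, value):
    """Return True if factory(value) raises TypeError (illustrative helper)."""
    try:
        factory(value)
    except TypeError:
        return True
    return False


# In Python 3 neither class accepts the other's input type.
mixing_rejected = rejects(io.BytesIO, "text") and rejects(io.StringIO, b"bytes")
```

Since response.content is bytes, it belongs in io.BytesIO, while response.text (a str) would go in io.StringIO.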
So, io.BytesIO(response.content) gives you a valid binary file-like object in both Python 2 and Python 3. If you only care about Python 2, cStringIO.StringIO(response.content) may be more efficient.
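Applied to the question, that might look like the sketch below. The function names fetch_pdf_reader and as_binary_file are made up for this example, the sample bytes are a stand-in for a real download, and newer PyPDF2 releases rename PdfFileReader to PdfReader, so adjust the import to your installed version:

```python
import io


def as_binary_file(data):
    """Wrap in-memory bytes in a seekable binary file-like object."""
    return io.BytesIO(data)


def fetch_pdf_reader(url):
    """Download url into memory and hand it to PyPDF2 without touching disk."""
    # Imported lazily so the rest of this sketch runs without these packages.
    import requests
    from PyPDF2 import PdfFileReader  # PdfReader in newer PyPDF2 releases

    response = requests.get(url)
    response.raise_for_status()  # fail early on HTTP errors
    return PdfFileReader(as_binary_file(response.content))


# No network needed to see the file-like behaviour: read, then rewind.
buf = as_binary_file(b"%PDF-1.4 example bytes")  # stand-in for response.content
header = buf.read(5)
buf.seek(0)
```

Note that response.content holds the whole body in memory whether or not stream=True was passed, so the plain requests.get call is enough here.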
Of course, "file-like" only goes so far; if the library tries to, e.g., call the fileno method and start making C calls against the file descriptor, it isn't going to work. But 99% of the time, this works.
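If you do run into a library that needs a real file descriptor, one possible fallback (not something the approach above requires) is to spill the bytes into a temporary file. The helper names here are hypothetical:

```python
import io
import tempfile


def has_real_descriptor(f):
    """Return True if f is backed by an OS-level file descriptor."""
    try:
        f.fileno()
    except (AttributeError, io.UnsupportedOperation):
        # io.BytesIO raises io.UnsupportedOperation from fileno().
        return False
    return True


def spill_to_tempfile(data):
    """Copy bytes into an anonymous temporary file and rewind it."""
    tmp = tempfile.TemporaryFile()  # removed automatically when closed
    tmp.write(data)
    tmp.seek(0)
    return tmp


data = b"example payload"  # stand-in for response.content
mem = io.BytesIO(data)
real = spill_to_tempfile(data)
```

This does touch the disk, so it only makes sense as a last resort when the in-memory object is rejected.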