Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python writing binary files, bytes

Python 3. I'm using QT's file dialog widget to save PDFs downloaded from the internet. I've been reading the file using 'open', and attempting to write it using the file dialog widget. However, I've been running into a"TypeError: '_io.BufferedReader' does not support the buffer interface" error.

Example code:

with open('file_to_read.pdf', 'rb') as f1: 
    with open('file_to_save.pdf', 'wb') as f2:
        f2.write(f1)

This logic works properly with text files when not using the 'b' designator, or when reading a file from the web, like with urllib or requests. These are of the 'bytes' type, which I think I need to be opening the file as. Instead, it's opening as a Buffered Reader. I tried bytes(f1), but get "TypeError: 'bytes' object cannot be interpreted as an integer." Any ideaas?

like image 919
Turtles Are Cute Avatar asked May 19 '13 02:05

Turtles Are Cute


People also ask

How do you write data into a binary file in Python?

First, open a file in binary write mode and then specify the contents to write in the form of bytes. Next, use the write function to write the byte contents to a binary file.

How does Python handle binary files?

To open a file in binary format, add 'b' to the mode parameter. Hence the "rb" mode opens the file in binary format for reading, while the "wb" mode opens the file in binary format for writing. Unlike text files, binary files are not human-readable. When opened using any text editor, the data is unrecognizable.

What is binary format in Python?

"Binary" files are any files where the format isn't made up of readable characters. Binary files can range from image files like JPEGs or GIFs, audio files like MP3s or binary document formats like Word or PDF. In Python, files are opened in text mode by default.

How does Python read byte data?

Python Read Binary File into Byte Array First, the file is opened in the“ rb “ mode. A byte array called mybytearray is initialized using the bytearray() method. Then the file is read one byte at a time using f. read(1) and appended to the byte array using += operator.


1 Answers

If your intent is to simply make a copy of the file, you could use shutil

>>> import shutil
>>> shutil.copyfile('file_to_read.pdf','file_to_save.pdf')

Or if you need to access byte by byte, similar to your structure, this works:

>>> with open('/tmp/fin.pdf','rb') as f1:
...    with open('/tmp/test.pdf','wb') as f2:
...       while True:
...          b=f1.read(1)
...          if b: 
...             # process b if this is your intent   
...             n=f2.write(b)
...          else: break

But byte by byte is potentially really slow.

Or, if you want a buffer that will speed this up (without taking the risk of reading an unknown file size completely into memory):

>>> with open('/tmp/fin.pdf','rb') as f1:
...    with open('/tmp/test.pdf','wb') as f2:
...       while True:
...          buf=f1.read(1024)
...          if buf: 
...              for byte in buf:
...                 pass    # process the bytes if this is what you want
...                         # make sure your changes are in buf
...              n=f2.write(buf)
...          else:
...              break

With Python 2.7+ or 3.1+ you can also use this shortcut (rather than using two with blocks):

with open('/tmp/fin.pdf','rb') as f1,open('/tmp/test.pdf','wb') as f2:
    ...
like image 100
dawg Avatar answered Sep 28 '22 12:09

dawg