
Convert io.StringIO to io.BytesIO


Original question: I have a StringIO object. How can I convert it into a BytesIO?

Update: the more general question is how to convert a binary (encoded) file-like object into a decoded file-like object in Python 3.

The naive approach I came up with is:

    import io

    sio = io.StringIO('wello horld')
    bio = io.BytesIO(sio.read().encode('utf8'))
    print(bio.read())  # prints b'wello horld'

Is there a more efficient and elegant way of doing this? The code above reads everything into memory and encodes it in one go, instead of streaming the data in chunks.
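To make "streaming the data in chunks" concrete, here is a minimal sketch that reads a text stream piece by piece and encodes each piece with an incremental encoder from the standard `codecs` module. The helper name `iter_encoded` and the `chunk_size` parameter are assumptions made for illustration; they are not part of the question.

```python
import codecs
import io


def iter_encoded(text_stream, encoding="utf-8", chunk_size=4096):
    """Yield encoded bytes chunks from a text stream (hypothetical helper)."""
    encoder = codecs.getincrementalencoder(encoding)()
    while True:
        chunk = text_stream.read(chunk_size)  # chunk_size counts characters
        if not chunk:
            break
        data = encoder.encode(chunk)
        if data:
            yield data
    # Flush whatever state the encoder may still be holding.
    tail = encoder.encode("", final=True)
    if tail:
        yield tail


sio = io.StringIO("wello horld")
print(b"".join(iter_encoded(sio, chunk_size=4)))  # prints b'wello horld'
```

Only one chunk is held in memory at a time, but the result is an iterator of bytes rather than a file-like object, so it only fits consumers that accept an iterable.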

For example, for the reverse question (BytesIO -> StringIO) there is a class, io.TextIOWrapper, which does exactly that (see this answer).
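For reference, the reverse direction mentioned here can be sketched as follows. The sample data is made up for illustration; `io.TextIOWrapper` decodes lazily as the text is read.

```python
import io

# A binary stream holding UTF-8 encoded text (sample data for illustration).
bio = io.BytesIO("wello horld\nsecond line\n".encode("utf-8"))

# TextIOWrapper presents the binary stream as a text stream.
text = io.TextIOWrapper(bio, encoding="utf-8")

first = text.readline()
rest = text.read()
print(repr(first))  # prints 'wello horld\n'
print(repr(rest))   # prints 'second line\n'
```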

ShmulikA asked Apr 28 '19 10:04





2 Answers

It's interesting that, although the question seems reasonable, it is hard to think of a practical reason to convert a StringIO into a BytesIO. Both are essentially in-memory buffers, and you usually need only one of them to do further work on either the bytes or the text.

I may be wrong, but I think your actual question is how to use a BytesIO instance when the code you want to pass it to expects a text file.

In that case, this is a common problem, and the solution is the codecs module.

The two usual ways of using it are the following:

Compose a File Object to Read

    In [16]: import codecs, io

    In [17]: bio = io.BytesIO(b'qwe\nasd\n')

    In [18]: StreamReader = codecs.getreader('utf-8')  # here you pass the encoding

    In [19]: wrapper_file = StreamReader(bio)

    In [20]: print(repr(wrapper_file.readline()))
    'qwe\n'

    In [21]: print(repr(wrapper_file.read()))
    'asd\n'

    In [26]: bio.seek(0)
    Out[26]: 0

    In [27]: for line in wrapper_file:
        ...:     print(repr(line))
        ...:
    'qwe\n'
    'asd\n'

Compose a File Object to Write To

    In [28]: bio = io.BytesIO()

    In [29]: StreamWriter = codecs.getwriter('utf-8')  # here you pass the encoding

    In [30]: wrapper_file = StreamWriter(bio)

    In [31]: print('жаба', 'цап', file=wrapper_file)

    In [32]: bio.getvalue()
    Out[32]: b'\xd0\xb6\xd0\xb0\xd0\xb1\xd0\xb0 \xd1\x86\xd0\xb0\xd0\xbf\n'

    In [33]: repr(bio.getvalue().decode('utf-8'))
    Out[33]: "'жаба цап\\n'"
newtover answered Sep 23 '22 20:09



@foobarna's answer can be improved by inheriting from an io base class:

    import io

    sio = io.StringIO('wello horld')


    class BytesIOWrapper(io.BufferedReader):
        """Wrap a buffered bytes stream over TextIOBase string stream."""

        def __init__(self, text_io_buffer, encoding=None, errors=None, **kwargs):
            super(BytesIOWrapper, self).__init__(text_io_buffer, **kwargs)
            # io.StringIO reports encoding and errors as None, so the defaults apply.
            self.encoding = encoding or text_io_buffer.encoding or 'utf-8'
            self.errors = errors or text_io_buffer.errors or 'strict'

        def _encoding_call(self, method_name, *args, **kwargs):
            # Call the same-named method on the wrapped text stream and encode the result.
            raw_method = getattr(self.raw, method_name)
            val = raw_method(*args, **kwargs)
            return val.encode(self.encoding, errors=self.errors)

        def read(self, size=-1):
            # size counts characters of the wrapped text stream, not bytes.
            return self._encoding_call('read', size)

        # read1() and peek() only work if the wrapped stream provides them;
        # io.StringIO does not, so these raise AttributeError for it.
        def read1(self, size=-1):
            return self._encoding_call('read1', size)

        def peek(self, size=-1):
            return self._encoding_call('peek', size)


    bio = BytesIOWrapper(sio)
    print(bio.read())  # b'wello horld'
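The wrapper above forwards read1() and peek() to the wrapped text stream, and io.StringIO provides neither. One possible variation, shown here as a sketch rather than as part of this answer, is to subclass io.RawIOBase, implement only readinto(), and let a stock io.BufferedReader supply read(), read1() and peek(). The class name `TextToBytesRaw` and its attributes are hypothetical.

```python
import codecs
import io


class TextToBytesRaw(io.RawIOBase):
    """Raw binary view over a text stream, encoded lazily (hypothetical class)."""

    def __init__(self, text_stream, encoding="utf-8", errors="strict"):
        super().__init__()
        self._text = text_stream
        self._encoder = codecs.getincrementalencoder(encoding)(errors)
        self._pending = b""   # encoded bytes not yet handed out
        self._eof = False

    def readable(self):
        return True

    def readinto(self, buffer):
        if len(buffer) == 0:
            return 0
        # Refill until at least one byte is available or the text is exhausted.
        while not self._pending and not self._eof:
            chunk = self._text.read(len(buffer))
            if chunk:
                self._pending = self._encoder.encode(chunk)
            else:
                self._pending = self._encoder.encode("", final=True)
                self._eof = True
        n = min(len(buffer), len(self._pending))
        buffer[:n] = self._pending[:n]
        self._pending = self._pending[n:]
        return n  # 0 signals end of stream


sio = io.StringIO("wello horld")
bio = io.BufferedReader(TextToBytesRaw(sio))
print(bio.read1(5))  # prints b'wello'
print(bio.read())    # prints b' horld'
```

With this layout the sizes passed to read() and read1() count bytes, as binary consumers expect, and the text is encoded chunk by chunk instead of all at once.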
imposeren answered Sep 22 '22 20:09
