
NumPy - What is the difference between frombuffer and fromstring?

Tags: python, numpy

They appear to give the same result to me:

In [32]: s
Out[32]: '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x15\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

In [27]: np.frombuffer(s, dtype="int8")
Out[27]:
array([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0, 21,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0], dtype=int8)

In [28]: np.fromstring(s, dtype="int8")
Out[28]:
array([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0, 21,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0], dtype=int8)

In [33]: b = buffer(s)

In [34]: b
Out[34]: <read-only buffer for 0x035F8020, size -1, offset 0 at 0x036F13A0>

In [35]: np.fromstring(b, dtype="int8")
Out[35]:
array([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0, 21,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0], dtype=int8)

In [36]: np.frombuffer(b, dtype="int8")
Out[36]:
array([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0, 21,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0], dtype=int8)

When should one be used vs. the other?

user202987 asked Mar 06 '14 21:03





1 Answer

From a practical standpoint, the difference is that:

x = np.fromstring(s, dtype='int8') 

will make a copy of the string in memory, while:

x = np.frombuffer(s, dtype='int8') 

or

x = np.frombuffer(buffer(s), dtype='int8') 

will use the string's memory buffer directly and won't use any* additional memory. Using frombuffer also results in a read-only array if the input is a string, as strings are immutable in Python.

(*Neglecting a few bytes of memory used for the additional Python ndarray object; the underlying memory for the data is shared.)
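The copy-versus-share distinction can be made visible with a small sketch. It assumes Python 3 and uses a bytearray (a mutable buffer) instead of a string so that the sharing is observable. Because NumPy has deprecated the binary mode of fromstring since version 1.14, an explicit `.copy()` stands in here for the copying behaviour; the variable names are illustrative only.

```python
import numpy as np

# A mutable buffer, so that sharing is observable (bytes would be read-only).
buf = bytearray(8)

# frombuffer wraps the existing memory; no data is copied.
view = np.frombuffer(buf, dtype=np.int8)

# An explicit copy owns its own memory (stand-in for the copying behaviour).
copied = view.copy()

# Mutate the original buffer after both arrays have been created.
buf[0] = 21

print(view[0])    # 21: the view sees the change
print(copied[0])  # 0: the copy is unaffected
```

The array returned by frombuffer does not own its data (`view.flags.owndata` is False), whereas the copy does.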


If you're not familiar with buffer objects (memoryview in Python 3.x), they're essentially a way for C-level libraries to expose a block of memory for use in Python. A buffer is basically a Python interface for managed access to raw memory.
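As a rough illustration of what "managed access to raw memory" means, the following standard-library-only sketch (assuming Python 3, with illustrative variable names) shows a memoryview exposing the bytes of a bytearray without copying them:

```python
# A bytearray owns a small block of mutable memory.
data = bytearray(b"\x00\x01\x02\x03")

# A memoryview exposes that block through the buffer protocol.
mv = memoryview(data)

# Slicing a memoryview yields another view, not a copy.
tail = mv[2:]
tail[0] = 9  # writes through to the underlying bytearray

print(data[2])                      # 9
print(mv.readonly)                  # False: bytearray is mutable
print(memoryview(b"abc").readonly)  # True: bytes are immutable
```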

If you are working with something that exposes the buffer interface, you probably want to use frombuffer. (Python 2.x strings and Python 3.x bytes expose the buffer interface, but you'll get a read-only array, as they are immutable.)

Otherwise, use fromstring to create a NumPy array from a string (unless you know what you're doing and want to tightly control memory use, etc.).
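The read-only behaviour mentioned above can be checked with a short sketch, assuming Python 3 bytes as the immutable input. The variable names are illustrative, and `.copy()` is used as one way to obtain a writable, independent array.

```python
import numpy as np

# Immutable input (bytes): the resulting array is read-only.
ro = np.frombuffer(b"\x00\x00\x15\x00", dtype=np.int8)
print(ro.flags.writeable)  # False

try:
    ro[0] = 1
    write_failed = False
except ValueError:
    # NumPy refuses to write into memory it does not own as writable.
    write_failed = True

# Mutable input (bytearray): the resulting array is writable.
rw = np.frombuffer(bytearray(b"\x00\x00\x15\x00"), dtype=np.int8)
rw[0] = 1

# A copy is always writable and independent of the original bytes.
writable_copy = ro.copy()
writable_copy[0] = 1
```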

Joe Kington answered Sep 29 '22 10:09
