 

How to read a NumPy 2D array from a string?


How can I read a NumPy array from a string? Take a string like:

"[[ 0.5544  0.4456], [ 0.8811  0.1189]]"

and convert it to an array:

a = from_string("[[ 0.5544  0.4456], [ 0.8811  0.1189]]")

where a becomes the object: np.array([[0.5544, 0.4456], [0.8811, 0.1189]]).

I'm looking for a very simple interface. A way to convert 2D arrays (of floats) to a string and then a way to read them back to reconstruct the array:

arr_to_string(array([[0.5544, 0.4456], [0.8811, 0.1189]])) should return "[[ 0.5544 0.4456], [ 0.8811 0.1189]]".

string_to_arr("[[ 0.5544 0.4456], [ 0.8811 0.1189]]") should return the object array([[0.5544, 0.4456], [0.8811, 0.1189]]).

Ideally arr_to_string would have a precision parameter that controls how many digits of each floating-point value are written to the string, so that you wouldn't get entries like 0.4444444999999999999999999.

There's nothing I can find in the NumPy docs that does this both ways. np.save lets you make a string but then there's no way to load it back in (np.load only works for files).

asked Feb 24 '16 20:02 by mvd



2 Answers

The challenge is to save not only the data buffer, but also the shape and dtype. np.fromstring reads the data buffer, but only as a 1-D array; you have to get the dtype and shape from elsewhere.

In [184]: a=np.arange(12).reshape(3,4)

In [185]: np.fromstring(a.tostring(),int)
Out[185]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [186]: np.fromstring(a.tostring(),a.dtype).reshape(a.shape)
Out[186]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
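On recent NumPy versions, tostring and the binary mode of np.fromstring are deprecated. A minimal sketch of the same round trip, assuming tobytes and np.frombuffer are used as the replacements (the variable names here are illustrative only):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Serialize only the raw data buffer; shape and dtype are NOT stored in it.
raw = a.tobytes()

# frombuffer always yields a 1-D (read-only) view of the bytes,
# so dtype and shape must be supplied from elsewhere.
flat = np.frombuffer(raw, dtype=a.dtype)
restored = flat.reshape(a.shape)
```

As in the session above, the bytes alone are not enough: the dtype and shape have to travel alongside them.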

A time-honored mechanism for saving Python objects is pickle, and numpy arrays can be pickled:

In [169]: import pickle

In [170]: a=np.arange(12).reshape(3,4)

In [171]: s=pickle.dumps(a*2)

In [172]: s
Out[172]: "cnumpy.core.multiarray\n_reconstruct\np0\n(cnumpy\nndarray\np1\n(I0\ntp2\nS'b'\np3\ntp4\nRp5\n(I1\n(I3\nI4\ntp6\ncnumpy\ndtype\np7\n(S'i4'\np8\nI0\nI1\ntp9\nRp10\n(I3\nS'<'\np11\nNNNI-1\nI-1\nI0\ntp12\nbI00\nS'\\x00\\x00\\x00\\x00\\x02\\x00\\x00\\x00\\x04\\x00\\x00\\x00\\x06\\x00\\x00\\x00\\x08\\x00\\x00\\x00\\n\\x00\\x00\\x00\\x0c\\x00\\x00\\x00\\x0e\\x00\\x00\\x00\\x10\\x00\\x00\\x00\\x12\\x00\\x00\\x00\\x14\\x00\\x00\\x00\\x16\\x00\\x00\\x00'\np13\ntp14\nb."

In [173]: pickle.loads(s)
Out[173]: 
array([[ 0,  2,  4,  6],
       [ 8, 10, 12, 14],
       [16, 18, 20, 22]])

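There's also a numpy function that can read the pickle string (np.loads was later deprecated in favor of pickle.loads, so it may be missing from current NumPy releases):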

In [181]: np.loads(s)
Out[181]: 
array([[ 0,  2,  4,  6],
       [ 8, 10, 12, 14],
       [16, 18, 20, 22]])

You mentioned that np.save can write the array out but that np.load can't read it back from a string. A way around that is to step further into the code and use the functions in np.lib.npyio.format directly, with a file-like string buffer.

In [174]: import StringIO

In [175]: S=StringIO.StringIO()  # a file like string buffer

In [176]: np.lib.npyio.format.write_array(S,a*3.3)

In [177]: S.seek(0)   # rewind the string

In [178]: np.lib.npyio.format.read_array(S)
Out[178]: 
array([[  0. ,   3.3,   6.6,   9.9],
       [ 13.2,  16.5,  19.8,  23.1],
       [ 26.4,  29.7,  33. ,  36.3]])

The save string has a header with dtype and shape info:

In [179]: S.seek(0)

In [180]: S.readlines()
Out[180]: 
["\x93NUMPY\x01\x00F\x00{'descr': '<f8', 'fortran_order': False, 'shape': (3, 4), }          \n",
 '\x00\x00\x00\x00\x00\x00\x00\x00ffffff\n',
 '@ffffff\x1a@\xcc\xcc\xcc\xcc\xcc\xcc#@ffffff*@\x00\x00\x00\x00\x00\x800@\xcc\xcc\xcc\xcc\xcc\xcc3@\x99\x99\x99\x99\x99\x197@ffffff:@33333\xb3=@\x00\x00\x00\x00\x00\x80@@fffff&B@']
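The session above is Python 2 (StringIO module). A rough Python 3 sketch of the same idea, assuming np.save and np.load are given an in-memory io.BytesIO buffer instead of a file on disk:

```python
import io
import numpy as np

a = np.arange(12).reshape(3, 4) * 3.3

# Write the array in .npy format into an in-memory buffer.
buf = io.BytesIO()
np.save(buf, a)
serialized = buf.getvalue()  # bytes: header (dtype, shape) + raw data

# Read it back by wrapping the bytes in a fresh file-like object.
restored = np.load(io.BytesIO(serialized))
```

The result is a bytes object, not human-readable text, but it carries the dtype and shape in its header so nothing has to be stored separately.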

If you want a human-readable string, you might try json.

In [196]: import json

In [197]: js=json.dumps(a.tolist())

In [198]: js
Out[198]: '[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]'

In [199]: np.array(json.loads(js))
Out[199]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Going to/from the list representation of the array is the most obvious use of json. Someone may have written a more elaborate json representation of arrays.
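To connect this with the interface asked for in the question, here is a minimal sketch that adds a precision parameter on top of the json round trip. The names arr_to_string and string_to_arr come from the question; rounding with np.round before dumping is an assumption about how precision could be handled, not the only option.

```python
import json
import numpy as np

def arr_to_string(arr, precision=4):
    # Round first so the JSON text does not carry long float tails,
    # then dump the nested-list form of the array.
    return json.dumps(np.round(arr, precision).tolist())

def string_to_arr(s):
    # json.loads returns nested lists; np.array infers shape and dtype.
    return np.array(json.loads(s))

arr = np.array([[0.5544, 0.4456], [0.8811, 0.1189]])
s = arr_to_string(arr)        # '[[0.5544, 0.4456], [0.8811, 0.1189]]'
back = string_to_arr(s)
```

Note that the string uses commas between all entries, unlike the format in the question, which is what makes it parseable by json.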

You could also go the csv format route - there have been lots of questions about reading/writing csv arrays.
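As a sketch of the csv route, assuming np.savetxt and np.loadtxt are pointed at an in-memory io.StringIO buffer (the helper names arr_to_csv and csv_to_arr are hypothetical):

```python
import io
import numpy as np

def arr_to_csv(arr, precision=4):
    # savetxt accepts a file-like object; fmt controls the digits written.
    buf = io.StringIO()
    np.savetxt(buf, arr, fmt=f"%.{precision}f", delimiter=",")
    return buf.getvalue()

def csv_to_arr(s):
    # ndmin=2 keeps a single-row input two-dimensional.
    return np.loadtxt(io.StringIO(s), delimiter=",", ndmin=2)

arr = np.array([[0.5544, 0.4456], [0.8811, 0.1189]])
text = arr_to_csv(arr)   # one line per row, comma-separated
back = csv_to_arr(text)
```

This only covers 2D (or 1D) numeric arrays, which matches the case in the question.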


'[[ 0.5544  0.4456], [ 0.8811  0.1189]]'

is a poor string representation for this purpose. It looks a lot like the str() of an array, but with , in place of \n between rows. There isn't a clean way to parse the nested [], and the missing delimiter between the numbers within a row is a pain. If the string used , consistently, json could convert it to a list.

np.matrix accepts a MATLAB-like string:

In [207]: np.matrix(' 0.5544,  0.4456;0.8811,  0.1189')
Out[207]: 
matrix([[ 0.5544,  0.4456],
        [ 0.8811,  0.1189]])

In [208]: str(np.matrix(' 0.5544,  0.4456;0.8811,  0.1189'))
Out[208]: '[[ 0.5544  0.4456]\n [ 0.8811  0.1189]]'
answered Sep 19 '22 03:09 by hpaulj


Forward to string:

import numpy as np

def array2str(arr, precision=None):
    # np.array_str separates rows with newlines; swap them for commas
    s = np.array_str(arr, precision=precision)
    return s.replace('\n', ',')

Backward to array:

import ast
import re
import numpy as np

def str2array(s):
    # Remove spaces after an opening bracket
    s = re.sub(r'\[ +', '[', s.strip())
    # Collapse runs of commas and whitespace into a single ", " separator
    s = re.sub(r'[,\s]+', ', ', s)
    return np.array(ast.literal_eval(s))

If you use repr() to convert the array to a string, the entries are already comma-separated, so converting back is trivial.
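A hedged sketch of that idea, assuming np.array2string with a comma separator is used to get repr-style delimiters without the surrounding array(...) wrapper. The helper names below are hypothetical, and this would not handle nan or inf entries, which ast.literal_eval cannot parse.

```python
import ast
import numpy as np

def array_to_literal(arr, precision=4):
    # separator=', ' yields a nested-list literal; a threshold above
    # arr.size prevents large arrays from being summarized with '...'.
    return np.array2string(arr, precision=precision, separator=', ',
                           threshold=arr.size + 1)

def literal_to_array(s):
    # The string is a valid Python list literal, so no regex cleanup is needed.
    return np.array(ast.literal_eval(s))

arr = np.array([[0.5544, 0.4456], [0.8811, 0.1189]])
s = array_to_literal(arr)
back = literal_to_array(s)
```

The rows are still separated by newlines in the string, which ast.literal_eval tolerates inside brackets.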

answered Sep 19 '22 03:09 by Peijun Zhu