I am trying to convert 'b' (a string in which the column entries are separated by one delimiter and the rows are separated by another delimiter) to 'a' (a 2D numpy array), like:
b='191.250\t0.00\t0\t1\n191.251\t0.00\t0\t1\n191.252\t0.00\t0\t1\n'
a=numpy.array([[191.25,0,0,1],[191.251,0,0,1],[191.252,0,0,1]])
The way I do it is (using my knowledge that there are 4 columns in 'a'):
a=numpy.array(list(filter(None,re.split('[\n\t]+',b))),dtype=float).reshape(-1,4)
Is there a better way?
Instead of splitting and filtering, you could use np.fromstring:
>>> np.fromstring(b, sep='\t').reshape(-1, 4)
array([[ 191.25 ,    0.   ,    0.   ,    1.   ],
       [ 191.251,    0.   ,    0.   ,    1.   ],
       [ 191.252,    0.   ,    0.   ,    1.   ]])
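np.fromstring always returns a 1D array, so the reshape is necessary. Passing sep='\t' also copes with the newlines, because extra whitespace between elements is ignored.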
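If you would rather not hardcode the 4 in the reshape, the column count can be read off the first row. The snippet below is only a sketch of that idea: it swaps np.fromstring for a plain str.split() followed by np.array, and the name n_cols is my own, not something from the question.

```python
import numpy as np

b = '191.250\t0.00\t0\t1\n191.251\t0.00\t0\t1\n191.252\t0.00\t0\t1\n'

# Count the tab-separated fields in the first row instead of assuming 4.
n_cols = len(b.split('\n', 1)[0].split('\t'))

# str.split() with no arguments splits on any run of whitespace
# (tabs and newlines alike) and never yields empty strings,
# so no filtering step is needed.
a = np.array(b.split(), dtype=float).reshape(-1, n_cols)

print(a.shape)
```

This assumes every row has the same number of fields; if one row is short, the reshape raises a ValueError or silently misaligns the values, depending on the total count.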
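Alternatively, to avoid reshaping, you could use np.genfromtxt, which reads from a file-like object. If you already have a string of bytes (as strings are in Python 2), wrap it with the standard library's io.BytesIO as shown here; in Python 3, where b is a str, use io.StringIO(b) instead: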
>>> import io
>>> np.genfromtxt(io.BytesIO(b))
array([[ 191.25 ,    0.   ,    0.   ,    1.   ],
       [ 191.251,    0.   ,    0.   ,    1.   ],
       [ 191.252,    0.   ,    0.   ,    1.   ]])
genfromtxt handles missing values, as well as offering much more control over how the final array is created.
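To illustrate the missing-value handling, here is a small Python 3 sketch. The input b_missing is a made-up variant of b in which the second row has an empty second field; it is not data from the question. An explicit delimiter='\t' is passed so that the empty field between two tabs is seen as a field at all.

```python
import io
import numpy as np

# Hypothetical input: same layout as b, but row 2 has an empty second field.
b_missing = '191.250\t0.00\t0\t1\n191.251\t\t0\t1\n191.252\t0.00\t0\t1\n'

# With an explicit tab delimiter, the empty field is treated as missing
# and, for float data, filled with nan by default.
a = np.genfromtxt(io.StringIO(b_missing), delimiter='\t')

print(a.shape)
print(np.isnan(a[1, 1]))
```

If nan is not what you want, genfromtxt has a filling_values parameter for choosing a different replacement value.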