
Concatenate a NumPy string array along an axis?

I have a 2-d numpy array of strings. Is there a way to concatenate the strings in each row and then join the resulting strings with a separator string, e.g. a newline?

Example:

import numpy as np
pic = np.array(['H','e','l','l','o','W','o','r','l','d']).reshape(2,5)

I want to get:

"Hello\nWorld\n"

ErikR asked Sep 18 '15 18:09



2 Answers

You had the right idea. Here's a vectorized NumPy implementation along those lines:

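# Create a column of separators with one entry per row of the input array
separator_str = np.repeat(['\n'], pic.shape[0])[:,None]

# Append that column along axis 1, then dump the raw buffer as a string.
# This relies on a byte-string dtype ('|S1'), where each element is one byte.
# (tostring() is deprecated since NumPy 1.19; tobytes() is the equivalent.)
out = np.concatenate((pic,separator_str),axis=1).tostring()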

Or a one-liner with np.column_stack -

np.column_stack((pic,np.repeat(['\n'], pic.shape[0])[:,None])).tostring()

Sample run -

In [123]: pic
Out[123]: 
array([['H', 'e', 'l', 'l', 'o'],
       ['W', 'o', 'r', 'l', 'd']], 
      dtype='|S1')

In [124]: np.column_stack((pic,np.repeat(['\n'], pic.shape[0])[:,None])).tostring()
Out[124]: 'Hello\nWorld\n'
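
The sample run above shows dtype '|S1', i.e. one byte per element. On Python 3, NumPy builds a Unicode array ('<U1') from the same input, so dumping the raw buffer would not give the same text. A minimal sketch of the same append-a-separator-column idea for that case, assuming ASCII-only characters for the byte-based variant (the names `sep`, `joined`, `out` and `out_via_bytes` are only illustrative):

```python
import numpy as np

pic = np.array(list("HelloWorld")).reshape(2, 5)  # dtype '<U1' on Python 3

# One newline per row, shaped (rows, 1) so it can be appended as a column
sep = np.full((pic.shape[0], 1), "\n")

# Append the separator column along axis 1
joined = np.concatenate((pic, sep), axis=1)

# Variant 1: flatten in row-major order and join the single characters
out = "".join(joined.ravel())

# Variant 2: cast to one-byte strings, dump the buffer, decode it
# (assumes every character is ASCII)
out_via_bytes = joined.astype("S1").tobytes().decode("ascii")

print(repr(out))
```

Both variants walk the array in row-major order, which is why the newline column ends up after each row's characters.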

Divakar answered Nov 09 '22 10:11


It's not hard to do outside of numpy:

>>> import numpy as np
>>> pic = np.array([ 'H','e','l','l','o','W','o','r','l','d']).reshape(2,5)
>>> pic
array([['H', 'e', 'l', 'l', 'o'],
       ['W', 'o', 'r', 'l', 'd']], 
      dtype='|S1')
>>> '\n'.join([''.join(row) for row in pic])
'Hello\nWorld'
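
Note that the join above yields 'Hello\nWorld' without the trailing newline the question asked for. A small sketch of how the same approach could be wrapped to cover both cases; `join_rows` and its `trailing` parameter are hypothetical names, not part of NumPy:

```python
import numpy as np

def join_rows(arr, sep="\n", trailing=True):
    """Join the strings in each row of a 2-D array, then join rows with sep.

    Hypothetical helper for illustration only.
    """
    lines = ["".join(row) for row in arr]
    text = sep.join(lines)
    # Optionally terminate the last row with the separator as well
    return text + sep if trailing and lines else text

pic = np.array(list("HelloWorld")).reshape(2, 5)
print(repr(join_rows(pic)))
print(repr(join_rows(pic, trailing=False)))
```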

There is also the np.core.defchararray module (also exposed as np.char), which has "goodies" for working with character arrays. However, its documentation states that these are merely wrappers around the Python built-in and standard library functions, so you probably won't get any real speedup by using them.
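
For illustration, here is one way those character-array functions could be applied to this problem. This is a sketch rather than a recommended approach: it folds the columns together with `np.char.add` (element-wise string concatenation), and the final join across rows still happens in plain Python.

```python
from functools import reduce

import numpy as np

pic = np.array(list("HelloWorld")).reshape(2, 5)

# pic.T iterates over the columns; adding them element-wise, left to right,
# builds one string per row
rows = reduce(np.char.add, pic.T)

# Join the per-row strings with the separator
text = "\n".join(rows)
print(repr(text))
```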

mgilson answered Nov 09 '22 10:11