Suppose I have two arrays (after import numpy as np),
a=np.array([['a',1],['b',2]],dtype=object)
and
b=np.array([['b',3],['c',4]],dtype=object)
How do I get:
c=np.array([['a',1,None],['b',2,3],['c',None,4]],dtype=object)
Basically, a join using the first column as the key.
Thanks
Use numpy.concatenate() to merge the contents of two or more arrays into a single array. The function takes a sequence of NumPy arrays and returns a new ndarray. It also accepts an axis argument, which defaults to 0 when not specified.
NumPy's concatenate function can join arrays either row-wise or column-wise. It takes two or more arrays whose shapes match except along the concatenation axis, and by default it concatenates row-wise, i.e. axis=0. For example, row-wise concatenation of two 3 x 3 arrays yields an array of shape 6 x 3, i.e. 6 rows and 3 columns.
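As a small sketch of what concatenate does with the two arrays from the question (the names rows and cols are just illustrative), note that it only stacks the arrays and does not match rows by key, so 'b' shows up twice:

```python
import numpy as np

a = np.array([['a', 1], ['b', 2]], dtype=object)
b = np.array([['b', 3], ['c', 4]], dtype=object)

# axis=0 (the default): stack b's rows under a's rows, shape (4, 2)
rows = np.concatenate([a, b])

# axis=1: place the arrays side by side, shape (2, 4)
cols = np.concatenate([a, b], axis=1)

print(rows.shape, cols.shape)
```

So concatenate on its own is not a join; the key 'b' would still have to be merged by some other step.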
In JavaScript, the array concat() method joins two or more arrays. It returns a new array containing the joined arrays and does not change the existing arrays. (NumPy arrays have no concat() method.)
We can use np.column_stack() to combine two 1-D arrays X and Y into a 2-D array. Then we can pass the result to pd.DataFrame to turn it into a DataFrame.
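A minimal sketch of that, assuming pandas is installed; the sample values and the column names "x" and "y" are made up for illustration:

```python
import numpy as np
import pandas as pd

X = np.array([1, 2, 3])
Y = np.array([10, 20, 30])

# Each 1-D array becomes one column of the 2-D result, shape (3, 2)
stacked = np.column_stack((X, Y))

# Wrap the 2-D array in a DataFrame with named columns
df = pd.DataFrame(stacked, columns=["x", "y"])
print(df)
```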
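A pure Python approach to do this would be:
da = dict(a)   # key -> value from the first array
db = dict(b)   # key -> value from the second array
# Union of the keys; sorted() gives a deterministic row order.
# dict.get() returns None for a key missing from one side.
c = np.array([(k, da.get(k), db.get(k))
              for k in sorted(set(da) | set(db))], dtype=object)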
But if you are using NumPy, your arrays are probably big, and you are looking for a solution with better performance. In that case, I suggest using a real database to do the join, for example the sqlite3 module that comes with Python.
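Here is a rough sketch of how the sqlite3 route could look. The table names ta and tb and the column names k and v are my own choices, and it assumes the keys are unique within each array. Since older SQLite versions lack FULL OUTER JOIN, the query emulates it with a LEFT JOIN plus the rows that exist only in the second table:

```python
import sqlite3
import numpy as np

a = np.array([['a', 1], ['b', 2]], dtype=object)
b = np.array([['b', 3], ['c', 4]], dtype=object)

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE ta (k TEXT PRIMARY KEY, v INTEGER)")
conn.execute("CREATE TABLE tb (k TEXT PRIMARY KEY, v INTEGER)")

# tolist() yields plain Python rows that sqlite3 can bind as parameters
conn.executemany("INSERT INTO ta VALUES (?, ?)", a.tolist())
conn.executemany("INSERT INTO tb VALUES (?, ?)", b.tolist())

# Emulated full outer join: all rows of ta, plus rows only present in tb
query = """
    SELECT ta.k, ta.v, tb.v FROM ta LEFT JOIN tb ON ta.k = tb.k
    UNION ALL
    SELECT tb.k, NULL, tb.v FROM tb
    WHERE tb.k NOT IN (SELECT k FROM ta)
    ORDER BY 1
"""
rows = conn.execute(query).fetchall()
conn.close()

# SQL NULL comes back as None, matching the desired output
c = np.array(rows, dtype=object)
print(c)
```

Whether this actually beats the dict version depends on the data size; for large inputs you would probably keep the data in the database rather than round-tripping through arrays.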