I have two arrays that look like this:
20131010 123 12321 12312312
20131011 123 12321 12312312
20131012 123 12321 12312312
20131013 123 12321 12312312

20131010 bbbb sad sadsad
20131011 asd asdd asdad
20231012 123 12321 12312312
20141013 123 12321 12312312
20141023 123 12321 12312312
Now I need to inner join these two arrays on the first column (the date). The result should look like this:
20131010 123 12321 12312312 bbbb sad sadsad
20131011 123 12321 12312312 asd asdd asdad
How do I do this? Note that each array has lots of columns, so I can't name every column, but the join is on a single column only.
This is horribly under-documented, but check out numpy.lib.recfunctions.join_by. It does several kinds of SQL-like joins, including inner joins. I don't see this module on the NumPy page, but at least the docstrings give you some info (copied from 1.9.1 below).
Note that this looks like it requires structured arrays to work, so you may have to convert your data into structured arrays (or recarrays) with named fields, rather than just saying "join on column 0".
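As a rough illustration of that conversion, here is a minimal sketch. The helper `to_structured` and the field names (`date`, `a1`, `b1`, ...) are made up for this example, and the second array uses placeholder numeric values instead of the strings from the question, so treat it as a starting point under those assumptions rather than a complete solution. Generating the field names in a loop gets around having to name every column by hand.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# First array from the question: a date column plus three numeric columns.
a = np.array([
    [20131010, 123, 12321, 12312312],
    [20131011, 123, 12321, 12312312],
    [20131012, 123, 12321, 12312312],
    [20131013, 123, 12321, 12312312],
])

# Second array: the dates from the question, but with placeholder numeric
# values (an assumption; the question's second array holds strings).
b = np.array([
    [20131010, 1, 2, 3],
    [20131011, 4, 5, 6],
    [20231012, 7, 8, 9],
    [20141013, 10, 11, 12],
    [20141023, 13, 14, 15],
])


def to_structured(arr, prefix):
    """Hypothetical helper: copy a 2-D array into a structured array.

    Column 0 becomes the field 'date'; the remaining columns are named
    prefix1, prefix2, ... so no column has to be named by hand.
    """
    names = ["date"] + [f"{prefix}{i}" for i in range(1, arr.shape[1])]
    out = np.empty(arr.shape[0], dtype=[(name, arr.dtype) for name in names])
    for i, name in enumerate(names):
        out[name] = arr[:, i]
    return out


# Inner join on the 'date' field; usemask=False asks for a plain ndarray
# instead of a MaskedArray.
joined = rfn.join_by(
    "date",
    to_structured(a, "a"),
    to_structured(b, "b"),
    jointype="inner",
    usemask=False,
)

print(joined.dtype.names)
print(joined)
```

The different prefixes keep the non-key field names unique, so the `r1postfix`/`r2postfix` renaming never comes into play.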
Join arrays `r1` and `r2` on key `key`.
The key should be either a string or a sequence of string corresponding
to the fields used to join the array. An exception is raised if the
`key` field cannot be found in the two input arrays. Neither `r1` nor
`r2` should have any duplicates along `key`: the presence of duplicates
will make the output quite unreliable. Note that duplicates are not
looked for by the algorithm.
Parameters
----------
key : {string, sequence}
A string or a sequence of strings corresponding to the fields used
for comparison.
r1, r2 : arrays
Structured arrays.
jointype : {'inner', 'outer', 'leftouter'}, optional
If 'inner', returns the elements common to both r1 and r2.
If 'outer', returns the common elements as well as the elements of
r1 not in r2 and the elements of not in r2.
If 'leftouter', returns the common elements and the elements of r1
not in r2.
r1postfix : string, optional
String appended to the names of the fields of r1 that are present
in r2 but absent of the key.
r2postfix : string, optional
String appended to the names of the fields of r2 that are present
in r1 but absent of the key.
defaults : {dictionary}, optional
Dictionary mapping field names to the corresponding default values.
usemask : {True, False}, optional
Whether to return a MaskedArray (or MaskedRecords is
`asrecarray==True`) or a ndarray.
asrecarray : {False, True}, optional
Whether to return a recarray (or MaskedRecords if `usemask==True`)
or just a flexible-type ndarray.
Notes
-----
* The output is sorted along the key.
* A temporary array is formed by dropping the fields not in the key for
the two arrays and concatenating the result. This array is then
sorted, and the common entries selected. The output is constructed by
filling the fields with the selected entries. Matching is not
preserved if there are some duplicates...
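If the data stays as plain 2-D arrays like the ones in the question, the same "find the common keys, then pick the matching rows" idea can be sketched without structured arrays. This is not what join_by does internally, just an alternative using np.intersect1d (its return_indices option needs NumPy 1.15 or later). Like join_by, it assumes there are no duplicate keys within either array.

```python
import numpy as np

# The two arrays from the question, kept as 2-D arrays of strings.
a = np.array([
    ["20131010", "123", "12321", "12312312"],
    ["20131011", "123", "12321", "12312312"],
    ["20131012", "123", "12321", "12312312"],
    ["20131013", "123", "12321", "12312312"],
])
b = np.array([
    ["20131010", "bbbb", "sad", "sadsad"],
    ["20131011", "asd", "asdd", "asdad"],
    ["20231012", "123", "12321", "12312312"],
    ["20141013", "123", "12321", "12312312"],
    ["20141023", "123", "12321", "12312312"],
])

# Keys present in both first columns, plus the row index of each key
# in a and in b. The common keys come back sorted.
common, ia, ib = np.intersect1d(a[:, 0], b[:, 0], return_indices=True)

# Matching rows of a, followed by the non-key columns of the matching rows of b.
joined = np.hstack([a[ia], b[ib, 1:]])

print(joined)
```

The result is again a 2-D array of strings with one row per common date, ordered by date.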