I have a 2D array of Numpy data read from a .csv file. Each row represents a data point with the final column containing a a 'key' which corresponds uniquely to 'key' in another Numpy array - the 'lookup table' as it were.
What is the best (most Numpythonic) way to match up the lines in the first table with the values in the second?
WeldNumpy is a Weld-enabled library that provides a subclass of NumPy's ndarray module, called weldarray, which supports automatic parallelization, lazy evaluation, and various other optimizations for data science workloads.
NumPy is a Python library that provides a simple yet powerful data structure: the n-dimensional array. This is the foundation on which almost all the power of Python's data science toolkit is built, and learning NumPy is the first step on any Python data scientist's journey.
Also as expected, the Numpy array performed faster than the dictionary. However, the dictionary performed faster in Python than in Julia.
Array indexing is the same as accessing an array element. You can access an array element by referring to its index number. The indexes in NumPy arrays start with 0, meaning that the first element has index 0, and the second has index 1 etc.
In the special case when the index can be calculated from the keys, the dictionary can be avoided. It's an advantage when the key of the lookup table can be chosen.
For Vebjorn Ljosa's example:
lookup:
>>> lookup[a[:,0]-1, :]
array([[ 1. , 3.14 , 4.14 ],
[ 1. , 3.14 , 4.14 ],
[ 2. , 2.71818, 3.7 ],
[ 3. , 42. , 43. ]])
merge:
>>> np.hstack([a, lookup[a[:,0]-1, :]])
array([[ 1. , 11. , 1. , 3.14 , 4.14 ],
[ 1. , 12. , 1. , 3.14 , 4.14 ],
[ 2. , 21. , 2. , 2.71818, 3.7 ],
[ 3. , 31. , 3. , 42. , 43. ]])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With