I have one big array:
[(1.0, 3.0, 1, 427338.4297000002, 4848489.4332)
(1.0, 3.0, 2, 427344.7937000003, 4848482.0692)
(1.0, 3.0, 3, 427346.4297000002, 4848472.7469) ...,
(1.0, 1.0, 7084, 427345.2709999997, 4848796.592)
(1.0, 1.0, 7085, 427352.9277999997, 4848790.9351)
(1.0, 1.0, 7086, 427359.16060000006, 4848787.4332)]
I want to split this array into multiple arrays based on the 2nd value in each row (3.0, 3.0, 3.0, ..., 1.0, 1.0, 1.0).
Every time the 2nd value changes, I want a new array, so basically each new array has the same 2nd value. I've looked this up on Stack Overflow and know of the command
np.split(array, number)
but I'm not trying to split the array into a certain number of arrays, but rather by a value. How would I be able to split the array in the way specified above? Any help would be appreciated!
You can find the indices where the values differ by using numpy.where and numpy.diff on the second column:
>>> import numpy as np
>>> arr = np.array([(1.0, 3.0, 1, 427338.4297000002, 4848489.4332),
...                 (1.0, 3.0, 2, 427344.7937000003, 4848482.0692),
...                 (1.0, 3.0, 3, 427346.4297000002, 4848472.7469),
...                 (1.0, 1.0, 7084, 427345.2709999997, 4848796.592),
...                 (1.0, 1.0, 7085, 427352.9277999997, 4848790.9351),
...                 (1.0, 1.0, 7086, 427359.16060000006, 4848787.4332)])
>>> np.split(arr, np.where(np.diff(arr[:,1]))[0]+1)
[array([[ 1.00000000e+00, 3.00000000e+00, 1.00000000e+00,
4.27338430e+05, 4.84848943e+06],
[ 1.00000000e+00, 3.00000000e+00, 2.00000000e+00,
4.27344794e+05, 4.84848207e+06],
[ 1.00000000e+00, 3.00000000e+00, 3.00000000e+00,
4.27346430e+05, 4.84847275e+06]]),
array([[ 1.00000000e+00, 1.00000000e+00, 7.08400000e+03,
4.27345271e+05, 4.84879659e+06],
[ 1.00000000e+00, 1.00000000e+00, 7.08500000e+03,
4.27352928e+05, 4.84879094e+06],
[ 1.00000000e+00, 1.00000000e+00, 7.08600000e+03,
4.27359161e+05, 4.84878743e+06]])]
Explanation:
Here we first fetch the items in the second column:
>>> arr[:,1]
array([ 3., 3., 3., 1., 1., 1.])
Now, to find out where the items actually change, we can use numpy.diff:
>>> np.diff(arr[:,1])
array([ 0., 0., -2., 0., 0.])
Anything non-zero means that the item next to it was different. We can use numpy.where to find the indices of the non-zero items and then add 1, because the row that starts a new group sits one position after the returned index:
>>> np.where(np.diff(arr[:,1]))[0]+1
array([3])
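The steps above can be wrapped in a small helper. This is only a sketch: the name split_on_change is hypothetical (it is not a NumPy function), and it assumes a regular 2-D array rather than a structured one. It compares neighbouring values with != instead of numpy.diff, which should also work for key columns where subtraction is not defined, and it uses numpy.flatnonzero as a shorthand for np.where(...)[0].

```python
import numpy as np

def split_on_change(arr, col):
    """Split a 2-D array wherever the value in column `col` changes.

    Hypothetical helper for illustration; not part of NumPy.
    """
    keys = arr[:, col]
    # True at position i means keys[i + 1] differs from keys[i].
    changed = keys[1:] != keys[:-1]
    # Shift by one so each index points at the first row of a new run.
    boundaries = np.flatnonzero(changed) + 1
    return np.split(arr, boundaries)

arr = np.array([(1.0, 3.0, 1, 427338.4297000002, 4848489.4332),
                (1.0, 3.0, 2, 427344.7937000003, 4848482.0692),
                (1.0, 3.0, 3, 427346.4297000002, 4848472.7469),
                (1.0, 1.0, 7084, 427345.2709999997, 4848796.592),
                (1.0, 1.0, 7085, 427352.9277999997, 4848790.9351),
                (1.0, 1.0, 7086, 427359.16060000006, 4848787.4332)])

parts = split_on_change(arr, 1)
for part in parts:
    # Each part holds one consecutive run of equal values in column 1.
    print(part[0, 1], part.shape)
```

Note that this splits on consecutive runs: if a value reappears later in the column after a different value, it ends up in a separate array rather than being merged with the earlier run.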