Is there an efficient way of creating a 2D array of the values from unsorted coordinate points (i.e. not all lons and/or lats are ascending or descending) without using loops?
Example Data
lats = np.array([45.5,45.5,45.5,65.3,65.3,65.3,43.2,43.2,43.2,65.3])
lons = np.array([102.5,5.5,116.2,102.5,5.5,116.2,102.5,5.5,116.2,100])
vals = np.array([3,4,5,6,7,7,9,1,0,4])
Example Output
Each column represents a unique longitude (102.5, 5.5, 116.2, & 100) and each row represents a unique latitude (45.5, 65.3, & 43.2).
([ 3, 4, 5, NaN],
[ 6, 7, 7, 4],
[ 9, 1, 0, NaN])
It isn't so straightforward, though, because I don't necessarily know how many duplicates of each lon or lat there are, and that is what determines the shape of the array.
Update:
I had the data arranged incorrectly for my question. I have arranged it now, so they are all unique pairs and there is an additional data point to demonstrate how the data should be arranged when NaNs are present.
The example you have posted makes very little sense, and it doesn't allow any reasonable way to specify missing data. I am guessing here, but the only reasonable thing you may be dealing with seems to be something like this:
>>> lats = np.array([43.2, 43.2, 43.2, 45.5, 45.5, 45.5, 65.3, 65.3, 65.3])
>>> lons = np.array([5.5, 102.5, 116.2, 5.5, 102.5, 116.2, 5.5, 102.5, 116.2])
>>> vals = np.array([3, 4, 5, 6, 7, 7, 9, 1, 0])
where the value in vals[j] comes from latitude lats[j] and longitude lons[j], but the data may come scrambled, as in:
>>> indices = np.arange(9)
>>> np.random.shuffle(indices)
>>> lats = lats[indices]
>>> lons = lons[indices]
>>> vals = vals[indices]
>>> lats
array([ 45.5, 43.2, 65.3, 45.5, 43.2, 65.3, 45.5, 65.3, 43.2])
>>> lons
array([ 5.5, 116.2, 102.5, 116.2, 5.5, 116.2, 102.5, 5.5, 102.5])
>>> vals
array([6, 5, 1, 7, 3, 0, 7, 9, 4])
You can get this arranged into an array as follows:
>>> lat_vals, lat_idx = np.unique(lats, return_inverse=True)
>>> lon_vals, lon_idx = np.unique(lons, return_inverse=True)
>>> vals_array = np.empty(lat_vals.shape + lon_vals.shape)
>>> vals_array.fill(np.nan) # or whatever your desired missing data flag is
>>> vals_array[lat_idx, lon_idx] = vals
>>> vals_array
array([[ 3., 4., 5.],
[ 6., 7., 7.],
[ 9., 1., 0.]])
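The same steps also cope with missing pairs, since any cell that never receives a value keeps the fill flag. Here is a minimal sketch of that, run on the 10-point data from the question. The function name grid_from_points and its fill parameter are made up for illustration and are not part of NumPy.

```python
import numpy as np

def grid_from_points(lats, lons, vals, fill=np.nan):
    """Scatter vals onto a (unique lat) x (unique lon) grid; gaps keep `fill`."""
    # return_inverse gives, for every point, its position in the sorted uniques
    lat_vals, lat_idx = np.unique(lats, return_inverse=True)
    lon_vals, lon_idx = np.unique(lons, return_inverse=True)
    # float dtype so that NaN can be stored as the missing-data flag
    grid = np.full((lat_vals.size, lon_vals.size), fill, dtype=float)
    grid[lat_idx, lon_idx] = vals
    return lat_vals, lon_vals, grid

lats = np.array([45.5, 45.5, 45.5, 65.3, 65.3, 65.3, 43.2, 43.2, 43.2, 65.3])
lons = np.array([102.5, 5.5, 116.2, 102.5, 5.5, 116.2, 102.5, 5.5, 116.2, 100])
vals = np.array([3, 4, 5, 6, 7, 7, 9, 1, 0, 4])

lat_vals, lon_vals, grid = grid_from_points(lats, lons, vals)
print(lat_vals)  # row labels, ascending
print(lon_vals)  # column labels, ascending
print(grid)
```

Note that np.unique sorts, so rows and columns come out in ascending order rather than in the order the coordinates first appear. If a (lat, lon) pair occurs more than once, one of its values ends up in the cell and the others are silently dropped.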
If you're creating a 2D array, then all arrays will have to have the same number of points. If this is true, you can simply do
out = np.vstack((lats, lons, vals))
I think this might be what you're after, it matches your question at least :)
xsize = len(np.unique(lats))
ysize = len(np.unique(lons))
and then if your data is very well behaved
out = np.asarray(vals).reshape((xsize, ysize))  # assumes vals is already in row-major (lat, lon) order with no gaps
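If the grid is complete but the points arrive in arbitrary order, one option is to sort them first and then reshape. The sketch below assumes every (lat, lon) pair occurs exactly once, checks that, and reuses the shuffled 9-point data from the earlier answer.

```python
import numpy as np

lats = np.array([45.5, 43.2, 65.3, 45.5, 43.2, 65.3, 45.5, 65.3, 43.2])
lons = np.array([5.5, 116.2, 102.5, 116.2, 5.5, 116.2, 102.5, 5.5, 102.5])
vals = np.array([6, 5, 1, 7, 3, 0, 7, 9, 4])

n_lat = len(np.unique(lats))
n_lon = len(np.unique(lons))

# A plain reshape is only valid for a complete grid without duplicate points.
n_pairs = len(np.unique(np.column_stack((lats, lons)), axis=0))
if not (n_pairs == len(vals) == n_lat * n_lon):
    raise ValueError("points do not form a complete grid; reshape would misplace values")

# lexsort treats the LAST key as primary: sort by latitude, then by longitude
order = np.lexsort((lons, lats))
out = vals[order].reshape((n_lat, n_lon))
print(out)
```

When the check fails (as it would for the 10-point data in the question), an indexing approach based on np.unique(..., return_inverse=True) is the safer route, because it leaves missing cells as NaN instead of shifting values.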
import numpy as np
lats = np.array([45.5,45.5,45.5,65.3,65.3,65.3,43.2,43.2,43.2,65.3])
lons = np.array([102.5,5.5,116.2,102.5,5.5,116.2,102.5,5.5,116.2,100])
vals = np.array([3,4,5,6,7,7,9,1,0,4])
def unique_order(seq):
    # http://www.peterbe.com/plog/uniqifiers-benchmark (Dave Kirby)
    # Order preserving
    seen = set()
    return [x for x in seq if x not in seen and not seen.add(x)]
unique_lats, idx_lats = np.unique(lats, return_inverse=True)
unique_lons, idx_lons = np.unique(lons, return_inverse=True)
perm_lats = np.argsort(unique_order(lats))
perm_lons = np.argsort(unique_order(lons))
result = np.empty((len(unique_lats), len(unique_lons)))
result.fill(np.nan)
result[perm_lats[idx_lats], perm_lons[idx_lons]] = vals
print(result)
yields
[[ 3. 4. 5. nan]
[ 6. 7. 7. 4.]
[ 9. 1. 0. nan]]
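The order-preserving step can probably be done without the Python-level list comprehension by asking np.unique for return_index as well. This is only a sketch of that variant; the helper name first_seen_codes is made up here.

```python
import numpy as np

def first_seen_codes(a):
    """Return (labels in order of first appearance, per-point index into them)."""
    uniq, first_idx, inverse = np.unique(a, return_index=True, return_inverse=True)
    # order[i] = position (in the sorted uniques) of the i-th value to appear
    order = np.argsort(first_idx)
    # rank[k] = appearance rank of the k-th sorted unique value
    rank = np.empty_like(order)
    rank[order] = np.arange(order.size)
    return uniq[order], rank[inverse]

lats = np.array([45.5, 45.5, 45.5, 65.3, 65.3, 65.3, 43.2, 43.2, 43.2, 65.3])
lons = np.array([102.5, 5.5, 116.2, 102.5, 5.5, 116.2, 102.5, 5.5, 116.2, 100])
vals = np.array([3, 4, 5, 6, 7, 7, 9, 1, 0, 4])

lat_labels, rows = first_seen_codes(lats)
lon_labels, cols = first_seen_codes(lons)

result = np.full((lat_labels.size, lon_labels.size), np.nan)
result[rows, cols] = vals
print(result)
```

lat_labels and lon_labels hold the row and column coordinates in the order they first appear, so the grid has the same layout as the output above.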