I have an numpy array that I brought in from a netCDF file with the shape (930, 360, 720) where it is organized as (time, latitudes, longitudes).
At each lat/lon pair for each of the 930 time stamps, I need to count the number of times that the value meets or exceeds a threshold "x" (such as 0.2 or 0.5 etc.) and ultimately calculate the percentage that the threshold was exceeded at each point, then output the results so they can be plotted later on.
I have attempted numerous methods but here is my most recent:
lat_length = len(lats)
#where lats has been defined earlier when unpacked from the netCDF dataset
lon_length = len(lons)
#just as lats; also these were defined before using np.meshgrid(lons, lats)
for i in range(0, lat_length):
for j in range(0, lon_length):
if ice[:,i,j] >= x:
#code to count number of occurrences here
#code to calculate percentage here
percent_ice[i,j] += count / len(time) #calculation
#then go on to plot percent_ice
I hope this makes sense! I would greatly appreciate any help. I'm self taught in Python so I may be missing something simple.
Would this be a time to use the any() function? What would be the most efficient way to count the number of times the threshold was exceeded and then calculate the percentage?
You can compare the input 3D array with the threshold x
and then sum along the first axis with ndarray.sum(axis=0)
to get the count and thereby the percentages, like so -
# Calculate count after thresholding with x and summing along first axis
count = (ice > x).sum(axis=0)
# Get percentages (ratios) by dividing with first axis length
percent_ice = np.true_divide(count,ice.shape[0])
Ah, look, another meteorologist!
There are probably multiple ways to do this and my solution is unlikely to be the fastest since it uses numpy's MaskedArray
, which is known to be slow, but this should work:
Numpy has a data type called a MaskedArray
which actually contains two normal numpy
arrays. It contains a data array as well as a boolean mask. I would first mask all data that are greater than or equal to my threshold (use np.ma.masked_greater()
for just greater than):
ice = np.ma.masked_greater_equal(ice)
You can then use ice.count()
to determine how many values are below your threshold for each lat/lon point by specifying that you want to count along a specific axis:
n_good = ice.count(axis=0)
This should return a 2-dimensional array containing the number of good points. You can then calculate the number of bad by subtracting n_good
from ice.shape[0]
:
n_bad = ice.shape[0] - n_good
and calculate the percentage that are bad using:
perc_bad = n_bad/float(ice.shape[0])
There are plenty of ways to do this without using MaskedArray
. This is just the easy way that comes to mind for me.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With