I have a large historical CSV dataset of daily wind speed data from a set of weather stations in a region, and I need to compute, for each station, the average number of days per month on which wind speed exceeds 6 m/s. The stations do not all cover the same number of years. An example of the dataset is shown below.
head(windspeed_PR)
     STN Year Month Day WDSP WDSP.ms
1 860110 1974     6  19  9.3   4.784
2 860110 1974     7  13 19.0   9.774
3 860110 1974     7  22  9.9   5.093
4 860110 1974     8  20  9.5   4.887
5 860110 1974     9  10  3.3   1.698
6 860110 1974    10  10  6.6   3.395
So I basically need to count how many WDSP.ms values are higher than 6 for each Month of each Year and each station (STN), and then calculate the average number of days per month per station.
Could I please have suggestions on how to compute this value (preferably in R)?
This is fairly straightforward.
Using dplyr:
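library(dplyr)

# Count, per station, year and month, the days exceeding 6 m/s
windspeed_PR %>%
  group_by(STN, Year, Month) %>%
  summarize(n_days = n(),                # days with a measurement
            n_gt6  = sum(WDSP.ms > 6),   # days above 6 m/s
            p_gt6  = n_gt6 / n_days)     # proportion of such days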
This will return, for each station, year, month, the number of measurements, the number of measurements greater than 6, and their quotient (the proportion of measurements greater than 6).
It's not clear to me from your question whether you want this summarized further (say, collapsing years), but it should be a good starting point for any additional work.
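If collapsing years is what you are after, one possible follow-up is to average the monthly counts over years for each station and month. The sketch below is only an illustration: `toy` is a made-up data frame with the same column names as `windspeed_PR`, and `monthly_counts`, `avg_days`, `n_years` and `mean_days_gt6` are names I chose for the example. It assumes dplyr 1.0.0 or later (for the `.groups` argument) and that missing wind speeds should simply be skipped.

```r
library(dplyr)

# Hypothetical toy data with the same column names as windspeed_PR
toy <- data.frame(
  STN     = c(1, 1, 1, 1, 1, 2, 2),
  Year    = c(1974, 1974, 1974, 1975, 1975, 1974, 1974),
  Month   = c(7, 7, 7, 7, 7, 7, 7),
  WDSP.ms = c(9.8, 5.1, 7.2, 6.5, 3.0, 2.0, 8.1)
)

# Step 1: days above 6 m/s per station, year and month
monthly_counts <- toy %>%
  group_by(STN, Year, Month) %>%
  summarize(n_gt6 = sum(WDSP.ms > 6, na.rm = TRUE), .groups = "drop")

# Step 2: average those counts over the years available for each station
avg_days <- monthly_counts %>%
  group_by(STN, Month) %>%
  summarize(n_years       = n(),          # years contributing to the mean
            mean_days_gt6 = mean(n_gt6),  # average days per month above 6 m/s
            .groups = "drop")

print(avg_days)
```

Note that a station-year-month with no rows at all is left out of the average rather than counted as zero, so with sparse records like yours it may be worth checking `n_years` (or working with the proportion `p_gt6`) before interpreting the means.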