I have GPS data of ice speed from three different GPS receivers. The data are in a pandas dataframe with an index of julian day (incremental from the start of 2009).
This is a subset of the data (the main dataset is 3487235 rows...):
                     R2          R7         R8
1235.000000  116.321959  100.805197  96.519977
1235.000116         NaN  100.771133  96.234957
1235.000231         NaN  100.584559  97.249262
1235.000347  118.823610  100.169055  96.777833
1235.000463         NaN   99.753551  96.598350
1235.000579         NaN   99.338048  95.283989
1235.000694  113.995003   98.922544  95.154067
The dataframe has form:
Index: 6071320 entries, 127.67291667 to 1338.51805556
Data columns:
R2    3487235  non-null values
R7    3875864  non-null values
R8    1092430  non-null values
dtypes: float64(3)
R2 is sampled at a different rate from R7 and R8, hence the NaNs, which appear systematically at that spacing.
Trying df.plot() to plot the whole dataframe (or indexed row locations thereof) works fine in terms of plotting R7 and R8, but doesn't plot R2. Similarly, just doing df.R2.plot() also doesn't work. The only way to plot R2 is to do df.R2.dropna().plot(), but this also removes NaNs which signify periods of no data (rather than just a coarser sampling frequency than the other receivers).
Has anyone else come across this? Any ideas on the problem would be gratefully received :)
This is what Pandas documentation gives: na_values : scalar, str, list-like, or dict, optional Additional strings to recognize as NA/NaN. If dict passed, specific per-column NA values. By default the following values are interpreted as NaN: '', '#N/A', '#N/A N/A', '#NA', '-1.
The reason you're not seeing anything is that the default plot style is a line only. The line is interrupted at NaNs, so only runs of two or more consecutive values get drawn, and that never happens in your case: every R2 value sits between NaNs. You need to change the plotting style, which depends on what you want to see.
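To make that concrete, here is a minimal sketch that counts how many line segments a plain line plot could draw for a column. The helper `count_line_segments` and the small dataframe are made up for illustration (values rounded from the question's sample); they are not part of pandas.

```python
import numpy as np
import pandas as pd


def count_line_segments(series: pd.Series) -> int:
    """Count adjacent pairs of non-NaN values, i.e. segments a line can connect."""
    valid = series.notna().to_numpy()
    return int(np.sum(valid[:-1] & valid[1:]))


# Synthetic stand-in for the question's data: 10 s spacing (1/8640 day),
# with R2 holding a value only on every third row.
index = 1235.0 + np.arange(7) / 8640.0
df = pd.DataFrame(
    {
        "R2": [116.3, np.nan, np.nan, 118.8, np.nan, np.nan, 114.0],
        "R7": [100.8, 100.8, 100.6, 100.2, 99.8, 99.3, 98.9],
    },
    index=index,
)

print(count_line_segments(df["R2"]))  # no two valid R2 values are adjacent
print(count_line_segments(df["R7"]))  # every adjacent R7 pair is valid
```

R2 has no adjacent valid pair, so a line-only style has nothing to connect, while R7 is drawn as usual.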
For starters, try adding:
.plot(marker='o')
That should make all data points appear as circles. The plot easily gets cluttered, so adjusting markersize, markeredgecolor, etc. might be useful. I'm not fully used to how pandas uses matplotlib, so I often switch to matplotlib directly if plots get more complicated, e.g.:
import matplotlib.pyplot as plt
plt.plot(df.index, df.R2, 'o-')  # the index is a float Julian day, so pass it directly
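If you want a connected line for R2 but still want real outages to show as breaks, one possible approach (not something pandas does for you, and not from the answer above) is to drop all NaNs and then put a single NaN back inside every gap that is wider than the receiver's normal sampling interval. In the sketch below, the helper `dropna_keep_gaps` and the `max_gap` threshold are hypothetical; pick `max_gap` a bit larger than R2's usual spacing, in the same units as the index.

```python
import numpy as np
import pandas as pd


def dropna_keep_gaps(series: pd.Series, max_gap: float) -> pd.Series:
    """Drop NaNs, then re-insert one NaN inside every index gap wider than max_gap."""
    valid = series.dropna()
    if len(valid) < 2:
        return valid
    idx = valid.index.to_numpy(dtype=float)
    steps = np.diff(idx)
    is_gap = steps > max_gap
    if not is_gap.any():
        return valid
    # Put a NaN at the midpoint of each oversized gap so the line breaks there
    midpoints = idx[:-1][is_gap] + steps[is_gap] / 2.0
    breaks = pd.Series(np.nan, index=midpoints)
    return pd.concat([valid, breaks]).sort_index()


# Toy series: values every 1.0 index units, NaNs from the finer sampling of
# other columns, and a real outage between 2.0 and 10.0.
index = [0.0, 0.5, 1.0, 1.5, 2.0, 10.0, 10.5, 11.0]
r2 = pd.Series([5.0, np.nan, 6.0, np.nan, 7.0, 8.0, np.nan, 9.0], index=index)

r2_plot = dropna_keep_gaps(r2, max_gap=1.5)
print(r2_plot)
```

The result can be plotted with r2_plot.plot() as usual: the sampling NaNs are gone, so the points join up, and the NaN kept in the outage stops the line from being drawn across it.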