I am working on a set of data (x={time}, y={measure}) that comes from an instrument. Sometimes the source causes a spike in the data, which produces an incorrect plot and can lead to mistakes when calculating features such as max and min.
So I need to remove these spikes from my data, for example the spikes circled in red in the image:
I have found this example for de-spiking, but I don't know how to invert the signal, or whether the approach is correct for a non-symmetric signal. I also think it only detects the spikes, whereas I need to remove them with operations like fitting, etc.
I would like to know whether there are better ways to accomplish my task, or whether I simply have to adapt the example above to my situation (in which case I'll need help, because I have no idea how to do it).
Sometimes data exhibit unwanted transients, or spikes. Median filtering is a natural way to eliminate them. Consider the open-loop voltage across the input of an analog instrument in the presence of 60 Hz power-line noise.
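As a rough illustration of the median-filtering idea, here is a minimal NumPy sketch. The helper name `median_filter`, the window length, and the synthetic sine-plus-spikes signal are all assumptions made for the example, not part of the original data.

```python
import numpy as np

def median_filter(y, window=5):
    """Replace each sample with the median of a centered window (odd length)."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    y = np.asarray(y, dtype=float)
    half = window // 2
    # Repeat the edge values so the output has the same length as the input.
    padded = np.pad(y, half, mode="edge")
    return np.array([np.median(padded[i:i + window]) for i in range(len(y))])

# Synthetic stand-in for the instrument data: a slow sine with two injected spikes.
t = np.linspace(0.0, 1.0, 200)
clean = np.sin(2 * np.pi * t)
y = clean.copy()
y[50] += 5.0
y[120] -= 4.0

filtered = median_filter(y, window=5)
```

A spike narrower than half the window is outvoted by its neighbours and disappears, while slowly varying parts of the signal pass through almost unchanged. The window has to be tuned: too short and wide spikes survive, too long and genuine sharp features get flattened.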
Overview of spike filtering
The purpose of a spike filter is to suppress extreme changes in measured variable values, since they probably don't reflect actual changes in the monitored process. Small input changes are passed through without modification.
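The excerpt does not say how such a filter is implemented. One possible reading, sketched below, is a step limiter: each output sample may differ from the previous output by at most `max_step`. The function name `spike_filter` and the `max_step` parameter are hypothetical, chosen only for this illustration.

```python
import numpy as np

def spike_filter(y, max_step):
    """Limit the change between consecutive output samples to +/- max_step."""
    y = np.asarray(y, dtype=float)
    out = np.empty_like(y)
    out[0] = y[0]
    for i in range(1, len(y)):
        # Small steps pass through unchanged; large jumps are clipped.
        step = y[i] - out[i - 1]
        out[i] = out[i - 1] + np.clip(step, -max_step, max_step)
    return out

# Tiny example: one outlier (9.0) in an otherwise slowly rising series.
y = np.array([1.0, 1.1, 1.2, 9.0, 1.3, 1.4])
filtered = spike_filter(y, max_step=0.5)
```

In this example the outlier is reduced to about 1.7 and the following samples are unaffected. The trade-off is that a genuine fast change in the signal is also slowed down, so `max_step` should be set above the largest change you expect from the real process.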
Here is a set of steps you can follow to estimate the location of peaks:
Smooth the data. Any number of filters are available for this. An excellent starting point is the smooth function described in the SciPy Cookbook. It is up to you to select appropriate parameters such as the window size:
baseline = smooth(data, ...)
Treat the smoothed data as a baseline, sort of like a best fit line in the absence of a known fitting function. Subtract the baseline from the data:
noise = data - baseline
The result is essentially a rough estimate of the noise about your pseudo-fit. Set a threshold and chop off the parts where the noise is too large:
threshold = 3.0 * np.std(noise)
mask = np.abs(noise) > threshold
There are plenty of configuration options to play with here: smoothing filter type and window size, threshold factor and even metric. E.g., you can use IQR or something entirely different instead of standard deviation. What you do with the masked points is also entirely up to you. Common options are to discard entirely or to replace with the baseline values.
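The steps above can be put together in a short, self-contained sketch. It assumes a plain centered moving average as a stand-in for the cookbook's smooth function (the helper `moving_average` is hypothetical), uses synthetic data in place of the instrument output, and shows both of the common options for the masked points.

```python
import numpy as np

def moving_average(y, window):
    """Hypothetical stand-in for smooth(): centered moving average, odd window."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    padded = np.pad(y, half, mode="edge")  # keep output length equal to input
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")

def despike(y, window=11, factor=3.0):
    """Flag points far from the smoothed baseline and replace them with it."""
    y = np.asarray(y, dtype=float)
    baseline = moving_average(y, window)
    noise = y - baseline
    threshold = factor * np.std(noise)
    mask = np.abs(noise) > threshold
    cleaned = y.copy()
    cleaned[mask] = baseline[mask]  # option 1: replace with baseline values
    return cleaned, mask

# Synthetic data (an assumption for the demo): noisy sine with two spikes.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * t) + rng.normal(0.0, 0.02, t.size)
y[50] += 5.0
y[120] -= 4.0

cleaned, mask = despike(y, window=11, factor=3.0)

# Option 2: discard the flagged points entirely.
t_kept, y_kept = t[~mask], y[~mask]
```

Note that a moving-average baseline is itself pulled toward each spike (by roughly the spike height divided by the window length), so the replaced values are still slightly off. Using a median-based baseline, or discarding the flagged points, avoids that bias.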