Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Detrend with mask Python

I have a file which I read as follow:

data1 = np.loadtxt('lc1.out') 
x = data1[:, 0]
y = data1[:, 1] 

untreand

I would like to detrend it and I found a very useful link here.

model = np.polyfit(x, y, 2)
predicted = np.polyval(model, x)

Anyhow, I would like to mask a part of the data such as I will fit using only points outside the mask. For instance, I want to use only data lower than 639.5 and larger than 641.5 and the second-order polynomial fit.

I had the idea to use ma.masked_outside(x, 639.5, 641.5) such as it will be easy to save in an array only elements outside the mask...but I do not understand how to cast it with the polyfit.

like image 249
Panichi Pattumeros PapaCastoro Avatar asked May 16 '26 19:05

Panichi Pattumeros PapaCastoro


1 Answers

There might not be a hard reason to use masked arrays for your use case, unless it's for performance reasons or for making further use of the mask. So I'm going to show how to do it with and without a masked array.

But let's first detrend without masking in order to have a reference:

Second order polynomial detrending without masking

import numpy as np
import matplotlib.pyplot as plt

data1 = np.loadtxt('lc1.out')
x, y = data1.T

fig = plt.figure()

plt.subplot(2, 1, 1)
plt.title('polyfit, original data set')
plt.plot(x, y, 'c.')

coeff = np.polyfit(x, y, 2)

# no need to use the original x values here just for visualizing the polynomial
x_poly = np.linspace(x.min(), x.max())
y_poly = np.polyval(coeff, x_poly)
plt.plot(x_poly, y_poly, 'r-', linewidth=3)

mid = len(x_poly) // 2
plt.annotate('y = {:.7g} x\xB2 + {:.7g} x + {:.7g}'.format(*coeff),
             (x_poly[mid], y_poly[mid]), (0, 48), textcoords='offset points',
             arrowprops={'arrowstyle': '->'}, horizontalalignment='center')

plt.subplot(2, 1, 2)
plt.title('detrended')

# we need the original x values here, so we can remove the trend from all points
trend = np.polyval(coeff, x)
# note that simply subtracting the trend might not be enough for other data sets
plt.plot(x, y - trend, 'b.')
fig.show()

second order polynomial detrending without masking

Take note of the coefficients of the polynomial.

Second order polynomial detrending, selecting significant points

We can simply create new x and y arrays that contain only the wanted points. There is less potential for mistakes here.

This works in 3 steps. First we use a comparison operator on the array of interest, which results in a bool array with 'True' values at the indices where the comparison was true and 'False' values everywhere else.

Then we put the bool array through 'np.where()', which results in an array containing all the index numbers as values where the bool array had 'True' values, i. e. it's answering the question: "Where is my array truthy?"

And finally we peruse Numpy's advanced indexing and apply our result array of indices as index into both the x and y arrays, which filters out all unwanted indices.

import numpy as np
import matplotlib.pyplot as plt

data1 = np.loadtxt('lc1.out')
x, y = data1.T
select = np.where((x < 640.75) | (x > 641.25))
x_selection = x[select]  # numpy advanced indexing
y_selection = y[select]  # numpy advanced indexing

fig = plt.figure()

plt.subplot(2, 1, 1)
plt.title('polyfit, selecting significant points')
plt.plot(x_selection, y_selection, 'c.')

coeff = np.polyfit(x_selection, y_selection, 2)

# no need to use the original x values here just for visualizing the polynomial
x_poly = np.linspace(x_selection.min(), x_selection.max())
y_poly = np.polyval(coeff, x_poly)
plt.plot(x_poly, y_poly, 'r-', linewidth=3)

mid = len(x_poly) // 2
plt.annotate('y = {:.7g} x\xB2 + {:.7g} x + {:.7g}'.format(*coeff),
             (x_poly[mid], y_poly[mid]), (0, 48), textcoords='offset points',
             arrowprops={'arrowstyle': '->'}, horizontalalignment='center')

plt.subplot(2, 1, 2)
plt.title('detrended')

# we need the original x values here, so we can remove the trend from all points
trend = np.polyval(coeff, x)
# note that simply subtracting the trend might not be enough for other data sets
plt.plot(x, y - trend, 'b.')
fig.show()

second order polynomial detrending, selecting significant points

The coefficients are different now, as expected.

Second order polynomial detrending, masking unwanted points

Of course we can also use a masked array. Pay attention to the reversed logic: Masked points are those we don't want. As in the example data we don't want the points inside the interval, we use ma.masked_inside().

If for performance reasons we want to avoid creating a full copy of the original array, we can use the keyword copy=False. Making the original array read-only saves us from accidentally changing the values in the masked array by mutating the original array.

For masked arrays we need to use the version of the polyfit() function from the numpy.ma submodule, which properly ignores the unwanted values both in the masked version of x as well as in the unmasked companion array y. If we fail to do that, we get wrong results. Note that this is an easy mistake to make, so we may want to stick to the other method if we can help it.

import numpy as np
import numpy.ma as ma
import matplotlib.pyplot as plt

data1 = np.loadtxt('lc1.out')
x, y = data1.T
x.flags.writeable = False  # safety measure, as we don't copy
x_masked = ma.masked_inside(x, 640.75, 641.25, copy=False)

fig = plt.figure()

plt.subplot(2, 1, 1)
plt.title('polyfit, masking unwanted points')
plt.plot(x_masked, y, 'c.')

coeff = ma.polyfit(x_masked, y, 2)

# no need to use the original x values here just for visualizing the polynomial
x_poly = np.linspace(x_masked.min(), x_masked.max())
y_poly = np.polyval(coeff, x_poly)
plt.plot(x_poly, y_poly, 'r-', linewidth=3)

mid = len(x_poly) // 2
plt.annotate('y = {:.7g} x\xB2 + {:.7g} x + {:.7g}'.format(*coeff),
             (x_poly[mid], y_poly[mid]), (0, 48), textcoords='offset points',
             arrowprops={'arrowstyle': '->'}, horizontalalignment='center')

plt.subplot(2, 1, 2)
plt.title('detrended')

# we need the original x values here, so we can remove the trend from all points
trend = np.polyval(coeff, x)
# note that simply subtracting the trend might not be enough for other data sets
plt.plot(x, y - trend, 'b.')
fig.show()

second order polynomial detrending, masking unwanted points

The coefficients are the same as in the other method, which is good. Had we mistakenly used np.polyfit() instead, we would have ended up with the same coefficients as in the unmasked reference.

like image 85
blubberdiblub Avatar answered May 19 '26 09:05

blubberdiblub