Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

find peaks location in a spectrum numpy

Tags:

python

numpy

I have a TOF spectrum and I would like to implement an algorithm using python (numpy) that finds all the maxima of the spectrum and returns the corresponding x values.
I have looked up online and I found the algorithm reported below.

The assumption here is that near the maximum the difference between the value before and the value at the maximum is bigger than a number DELTA. The problem is that my spectrum is composed of points equally distributed, even near the maximum, so that DELTA is never exceeded and the function peakdet returns an empty array.

Do you have any idea how to overcome this problem? I would really appreciate comments to understand better the code since I am quite new in python.

Thanks!

import sys
from numpy import NaN, Inf, arange, isscalar, asarray, array

def peakdet(v, delta, x = None): 
   maxtab = []
   mintab = []

   if x is None:
      x = arange(len(v))
      v = asarray(v)

   if len(v) != len(x):
      sys.exit('Input vectors v and x must have same length')
   if not isscalar(delta):
      sys.exit('Input argument delta must be a scalar')
   if delta <= 0:
      sys.exit('Input argument delta must be positive')

   mn, mx = Inf, -Inf
   mnpos, mxpos = NaN, NaN

   lookformax = True

   for i in arange(len(v)):
      this = v[i]
      if this > mx:
         mx = this
         mxpos = x[i]
      if this < mn:
         mn = this
         mnpos = x[i]

      if lookformax:
         if this < mx-delta:
            maxtab.append((mxpos, mx))
            mn = this
            mnpos = x[i]
            lookformax = False
      else:
         if this > mn+delta:
            mintab.append((mnpos, mn))
            mx = this
            mxpos = x[i]
            lookformax = True

return array(maxtab), array(mintab)

Below is shown part of the spectrum. I actually have more peaks than those shown here.

enter image description here

like image 707
diegus Avatar asked Jul 09 '14 14:07

diegus


People also ask

How do you find peaks?

A common requirement in scientific data processing is to detect peaks in a signal and to measure their positions, heights, widths, and/or areas. One way to do this is to make use of the fact that the first derivative of a peak has a downward-going zero-crossing at the peak maximum.

What does Find_peaks return?

pks = findpeaks( data ) returns a vector with the local maxima (peaks) of the input signal vector, data . A local peak is a data sample that is either larger than its two neighboring samples or is equal to Inf . The peaks are output in order of occurrence.

What is Peakutils?

This package provides utilities related to the detection of peaks on 1D data. Includes functions to estimate baselines, finding the indexes of peaks in the data and performing Gaussian fitting or centroid computation to further increase the resolution of the peak detection.


2 Answers

As of SciPy version 1.1, you can also use find_peaks:

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import find_peaks

np.random.seed(0)

Y = np.zeros(1000)

# insert @deinonychusaur's peaker function here

peaker(Y)

# make data noisy
Y = Y + 10e-4 * np.random.randn(len(Y))
# find_peaks gets the maxima, so we multiply our signal by -1
Y *= -1 
# get the actual peaks
peaks, _ = find_peaks(Y, height=0.002)
# multiply back for plotting purposes
Y *= -1
plt.plot(Y)
plt.plot(peaks, Y[peaks], "x")
plt.show()

This will plot (note that we use height=0.002 which will only find peaks higher than 0.002):

enter image description here

In addition to height, we can also set the minimal distance between two peaks. If you use distance=100, the plot then looks as follows:

enter image description here

You can use

peaks, _ = find_peaks(Y, height=0.002, distance=100)

in the code above.

like image 142
Cleb Avatar answered Sep 30 '22 08:09

Cleb


After looking at the answers and suggestions I decided to offer a solution I often use because it is straightforward and easier to tweak. It uses a sliding window and counts how many times a local peak appears as a maximum as window shifts along the x-axis. As @DrV suggested, no universal definition of "local maximum" exists, meaning that some tuning parameters are unavoidable. This function uses "window size" and "frequency" to fine tune the outcome. Window size is measured in number of data points of independent variable (x) and frequency counts how sensitive should peak detection be (also expressed as a number of data points; lower values of frequency produce more peaks and vice versa). The main function is here:

def peak_finder(x0, y0, window_size, peak_threshold):
    # extend x, y using window size
    y = numpy.concatenate([y0, numpy.repeat(y0[-1], window_size)])
    x = numpy.concatenate([x0, numpy.arange(x0[-1], x0[-1]+window_size)])
    local_max = numpy.zeros(len(x0))
    for ii in range(len(x0)):
        local_max[ii] = x[y[ii:(ii + window_size)].argmax() + ii]

    u, c = numpy.unique(local_max, return_counts=True)
    i_return = numpy.where(c>=peak_threshold)[0]
    return(list(zip(u[i_return], c[i_return])))

along with a snippet used to produce the figure shown below:

import numpy
from matplotlib import pyplot

def plot_case(axx, w_f):
    p = peak_finder(numpy.arange(0, len(Y)), -Y, w_f[0], w_f[1])
    r = .9*min(Y)/10
    axx.plot(Y)
    for ip in p:
        axx.text(ip[0], r + Y[int(ip[0])], int(ip[0]),
                 rotation=90, horizontalalignment='center')
    yL = pyplot.gca().get_ylim()
    axx.set_ylim([1.15*min(Y), yL[1]])
    axx.set_xlim([-50, 1100])
    axx.set_title(f'window: {w_f[0]}, count: {w_f[1]}', loc='left', fontsize=10)
    return(None)

window_frequency = {1:(15, 15), 2:(100, 100), 3:(100, 5)}
f, ax = pyplot.subplots(1, 3, sharey='row', figsize=(9, 4),
                        gridspec_kw = {'hspace':0, 'wspace':0, 'left':.08,
                                       'right':.99, 'top':.93, 'bottom':.06})
for k, v in window_frequency.items():
    plot_case(ax[k-1], v)

pyplot.show()

Three cases show parameter values that render (from left to right panel): code output(1) too many, (2) too few, and (3) an intermediate amount of peaks.

To generate Y data, I used the function @deinonychusaur gave above, and added some noise to it from @Cleb's answer.

I hope some might find this useful, but it's efficiency primarily depends on actual peak shapes and distances.

like image 26
Dusan Kojic Avatar answered Sep 30 '22 07:09

Dusan Kojic