Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Identify important minima and maxima in time-series w/ Mathematica

I need a way to identify local minima and maxima in time series data with Mathematica. This seems like it should be an easy thing to do, but it gets tricky. I posted this on the MathForum, but thought I might get some additional eyes on it here.

You can find a paper that discusses the problem at: http://www.cs.cmu.edu/~eugene/research/full/compress-series.pdf

I've tried this so far…

Get and format some data:

data = FinancialData["SPY", {"May 1, 2006", "Jan. 21, 2011"}][[All, 2]];
data = data/First@data;
data = Transpose[{Range[Length@data], data}];

Define 2 functions:

First method:

findMinimaMaxima[data_, window_] := With[{k = window},
  data[[k + Flatten@Position[Partition[data[[All, 2]], 2 k + 1, 1],  x_List /;  x[[k + 1]] < Min[Delete[x, k + 1]] || x[[k + 1]] > Max[Delete[x, k + 1]]]]]]

Now another approach, although not as flexible:

findMinimaMaxima2[data_] := data[[Accumulate@(Length[#] & /@ Split[Prepend[Sign[Rest@data[[All, 2]] - Most@data[[All, 2]]], 0]])]]

Look at what each the functions does. First findMinimaMaxima2[]:

minmax = findMinimaMaxima2[data];
{Length@data, Length@minmax}
ListLinePlot@minmax

This selects all minima and maxima and results (in this instance) in about a 49% data compression, but it doesn't have the flexibility of expanding the window. This other method does. A window of 2, yields fewer and arguably more important extrema:

minmax2 = findMinimaMaxima[data, 2];
{Length@data, Length@minmax2}
ListLinePlot@minmax2

But look at what happens when we expand the window to 60:

minmax2 = findMinimaMaxima[data, 60];
ListLinePlot[{data, minmax2}]

Some of the minima and maxima no longer alternate. Applying findMinimaMaxima2[] to the output of findMinimaMaxima[] gives a workaround...

minmax3 = findMinimaMaxima2[minmax2];
ListLinePlot[{data, minmax2, minmax3}]

, but this seems like a clumsy way to address the problem.

So, the idea of using a fixed window to look left and right doesn't quite do everything one would like. I began thinking about an alternative that could use a range value R (e.g. a percent move up or down) that the function would need to meet or exceed to set the next minima or maxima. Here's my first try:

findMinimaMaxima3[data_, R_] := Module[{d, n, positions},
  d = data[[All, 2]];
  n = Transpose[{data[[All, 1]], Rest@FoldList[If[(#2 <= #1 + #1*R && #2 >= #1) || (#2 >= #1 - #1* R && #2 <= #1), #1, #2] &, d[[1]], d]}];
  n = Sign[Rest@n[[All, 2]] - Most@n[[All, 2]]];
  positions = Flatten@Rest[Most[Position[n, Except[0]]]];
  data[[positions]]
  ]

minmax4 = findMinimaMaxima3[data, 0.1];
ListLinePlot[{data, minmax4}]

This too benefits from post processing with findMinimaMaxima2[]

ListLinePlot[{data, findMinimaMaxima2[minmax4]}]

But if you look closely, you see that it misses the extremes if they go beyond the R value in several positions - including the chart's absolute minimum and maximum as well as along the big moves up and down. Changing the R value shows how it misses the top and bottoms even more:

minmax4 = findMinimaMaxima3[data, 0.15];
ListLinePlot[{data, minmax4}]

So, I need to reconsider. Anyone can look at a plot of the data and easily identify the important minima and maxima. It seems harder to get an algorithm to do it. A window and/or an R value seem important to the solution, but neither on their own seems enough (at least not in the approaches above).

Can anyone extend any of the approaches shown or suggest an alternative to identifying the important minima and maxima?

Happy to forward a notebook with all of this code and discussion in it. Let me know if anyone needs it.

Thank you, Jagra

like image 687
Jagra Avatar asked Jan 26 '11 19:01

Jagra


1 Answers

I suggest to use an iterative approach. The following functions are taken from this post, and while they can be written more concisely without Compile, they'll do the job:

localMinPositionsC = 
 Compile[{{pts, _Real, 1}}, 
   Module[{result = Table[0, {Length[pts]}], i = 1, ctr = 0}, 
    For[i = 2, i < Length[pts], i++, 
     If[pts[[i - 1]] > pts[[i]] && pts[[i + 1]] > pts[[i]], 
      result[[++ctr]] = i]];
    Take[result, ctr]]];

localMaxPositionsC = 
  Compile[{{pts, _Real, 1}}, 
    Module[{result = Table[0, {Length[pts]}], i = 1, ctr = 0}, 
      For[i = 2, i < Length[pts], i++, 
        If[pts[[i - 1]] < pts[[i]] && pts[[i + 1]] < pts[[i]], 
          result[[++ctr]] = i]];
       Take[result, ctr]]];

Here is your data plot:

dplot = ListLinePlot[data]

Here we plot the mins, which are obtained after 3 iterations:

mins = ListPlot[Nest[#[[localMinPositionsC[#[[All, 2]]]]] &, data, 3],
   PlotStyle -> Directive[PointSize[0.015], Red]]

The same for maxima:

maxs = ListPlot[Nest[#[[localMaxPositionsC[#[[All, 2]]]]] &, data, 3],
   PlotStyle -> Directive[PointSize[0.015], Green]]

And the resulting plot:

Show[{dplot, mins, maxs}]

enter image description here

You may vary the number of iterations, to get more coarse-grained or finer minima/maxima.

Edit:

actually, I just noticed that a couple of points were still missed by this method, both for the minima and maxima. So, I suggest it as a starting point, not as a complete solution. Perhaps, you could analyze minima/maxima, coming from different iterations, and sometimes include those from a "previous", more fine-grained one. Also, the only "physical reason" that this kind of works, is that the nature of the financial data appears to be fractal-like, with several distinctly different scales. Each iteration in the above Nest-s targets a particular scale. This would not work so well for an arbitrary signal.

like image 199
Leonid Shifrin Avatar answered Sep 18 '22 04:09

Leonid Shifrin