Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using PatternSequence with Cases in Mathematica to find peaks

Given pairs of coordinates

data = {{1, 0}, {2, 0}, {3, 1}, {4, 2}, {5, 1}, 
        {6, 2}, {7, 3}, {8, 4}, {9, 3}, {10, 2}}

I'd like to extract peaks and valleys, thus:

{{4, 2}, {5, 1}, {8, 4}}

My current solution is this clumsiness:

Cases[
 Partition[data, 3, 1],
 {{ta_, a_}, {tb_, b_}, {tc_, c_}} /; Or[a < b > c, a > b < c] :> {tb, b}
]

which you can see starts out by tripling the size of the data set using Partition. I think it's possible to use Cases and PatternSequence to extract this information, but this attempt doesn't work:

Cases[
 data,
 ({___, PatternSequence[{_, a_}, {t_, b_}, {_, c_}], ___} 
         /; Or[a < b > c, a > b < c]) :> {t, b}
]

That yields {}.

I don't think anything is wrong with the pattern because it works with ReplaceAll:

data /. ({___, PatternSequence[{_, a_}, {t_, b_}, {_, c_}], ___} 
             /; Or[a < b > c, a > b < c]) :> {t, b}

That gives the correct first peak, {4, 2}. What's going on here?

like image 211
ArgentoSapiens Avatar asked Sep 02 '11 17:09

ArgentoSapiens


5 Answers

One of the reasons why your failed attempt doesn't work is that Cases by default looks for matches on level 1 of your expression. Since your looking for matches on level 0 you would need to do something like

Cases[
 data,
 {___, {_, a_}, {t_, b_}, {_, c_}, ___} /; Or[a < b > c, a > b < c] :> {t, b}, 
 {0}
]

However, this only returns {4,2} as a solution so it's still not what you're looking for. To find all matches without partitioning you could do something like

ReplaceList[data, ({___, {_, a_}, {t_, b_}, {_, c_}, ___} /; 
    Or[a < b > c, a > b < c]) :> {t, b}]

which returns

{{4, 2}, {5, 1}, {8, 4}}
like image 53
Heike Avatar answered Nov 12 '22 09:11

Heike


Your "clumsy" solution is fairly fast, because it heavily restricts what gets looked at.

Here is an example.

m = 10^4;
n = 10^6;

ll = Transpose[{Range[n], RandomInteger[m, n]}];

In[266]:= 
Timing[extrema = 
    Cases[Partition[ll, 3, 
      1], {{ta_, a_}, {tb_, b_}, {tc_, c_}} /; 
       Or[a < b > c, a > b < c] :> {tb, b}];][[1]]

Out[266]= 3.88

In[267]:= Length[extrema]

Out[267]= 666463

This seems to be faster than using replacement rules.

Faster still is to create a sign table of products of differences. Then pick entries not on the ends of the list that correspond to sign products of 1.

In[268]:= Timing[ordinates = ll[[All, 2]];
  signs = 
   Table[Sign[(ordinates[[j + 1]] - 
        ordinates[[j]])*(ordinates[[j - 1]] - ordinates[[j]])], {j, 2,
      Length[ll] - 1}];
  extrema2 = Pick[ll[[2 ;; -2]], signs, 1];][[1]]

Out[268]= 0.23

In[269]:= extrema2 === extrema

Out[269]= True

Handling of consecutive equal ordinates is not considered in these methods. Doing that would take more work since one must consider neighborhoods larger than three consecutive elements. (My spell checker wants me to add a 'u' to the middle syllable of "neighborhoods". My spell checker must think we are in Canada.)

Daniel Lichtblau

like image 31
Daniel Lichtblau Avatar answered Nov 12 '22 11:11

Daniel Lichtblau


Another alternative:

Part[#,Flatten[Position[Differences[Sign[Differences[#[[All, 2]]]]], 2|-2] + 1]] &@data

(* ==> {{4, 2}, {5, 1}, {8, 4}} *)

Extract[#, Position[Differences[Sign[Differences[#]]], {_, 2} | {_, -2}] + 1] &@data

(* ==> {{4, 2}, {5, 1}, {8, 4}} *)
like image 2
Sjoerd C. de Vries Avatar answered Nov 12 '22 10:11

Sjoerd C. de Vries


This may be not exactly the implementation you ask, but along those lines:

ClearAll[localMaxPositions];
localMaxPositions[lst : {___?NumericQ}] := 
  Part[#, All, 2] &@
     ReplaceList[
        MapIndexed[List, lst], 
        {___, {x_, _}, y : {t_, _} .., {z_, _}, ___} /; x < t && z < t :> y];

Example:

In[2]:= test = RandomInteger[{1,20},30]
Out[2]= {13,9,5,9,3,20,2,5,18,13,2,20,13,12,4,7,16,14,8,16,19,20,5,18,3,15,8,8,12,9}

In[3]:= localMaxPositions[test]
Out[3]= {{4},{6},{9},{12},{17},{22},{24},{26},{29}}

Once you have positions, you may extract the elements:

In[4]:= Extract[test,%]
Out[4]= {9,20,18,20,16,20,18,15,12}

Note that this will also work for plateau-s where you have more than one same maximal element in a row. To get minima, one needs to trivially change the code. I actually think that ReplaceList is a better choice than Cases here.

To use it with your data:

In[7]:= Extract[data,localMaxPositions[data[[All,2]]]]
Out[7]= {{4,2},{8,4}}

and the same for the minima. If you want to combine, the change in the above rule is also trivial.

like image 2
Leonid Shifrin Avatar answered Nov 12 '22 10:11

Leonid Shifrin


Since one of your primary concerns about your "clumsy" method is the data expansion that takes place with Partition, you may care to know about the Developer` function PartitionMap, which does not partition all the data at once. I use Sequence[] to delete the elements that I don't want.

Developer`PartitionMap[
  # /. {{{_, a_}, x : {_, b_}, {_, c_}} /; a < b > c || a > b < c :> x,
        _ :> Sequence[]} &,
  data, 3, 1
]
like image 1
Mr.Wizard Avatar answered Nov 12 '22 11:11

Mr.Wizard