Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Collapse consecutive 'same' elements in array

Tags:

arrays

ruby

Goal

Given an array of elements and criteria to determine if two elements are the 'same', return a new array where runs of consecutive 'same' elements have been removed to leave only the end points. For example:

a = [ {k:'a',v:1}, {k:'b',v:1}, {k:'c',v:1},
      {k:'d',v:2}, {k:'e',v:2},
      {k:'f',v:3}, {k:'g',v:3}, {k:'h',v:3}, {k:'i',v:3}, {k:'j',v:3},
      {k:'k',v:2},
      {k:'l',v:4}, {k:'m',v:4}, {k:'n',v:4}, {k:'o',v:4} ]
b = a.collapse_consecutive{ |h| h[:v] }
#=> [ {k:'a',v:1}, {k:'c',v:1},
#=>   {k:'d',v:2}, {k:'e',v:2},
#=>   {k:'f',v:3}, {k:'j',v:3},
#=>   {k:'k',v:2},
#=>   {k:'l',v:4}, {k:'o',v:4} ]

Motivation

When plotting n points on a line graph a series of consecutive same-valued results have no effect on the graph except at the end points. In the graph below the black samples have no effect on the final graph. I am storing finely-sampled graphs and would ideally like to remove all irrelevant samples.

Line graph with inflection points represented by orange dots and several points along linear sections represented by black dots (on both horizontal and angled linear sections)

For this question I am simplifying the problem to only remove the black dots on horizontal sections, since identifying points that fall along angled linear sections is (a) harder and (b) more rare (in my case).

Current Progress

The best that I've come up with so far is this solution that relies upon array indexing:

class Array
  def collapse_consecutive
    select.with_index{ |o,i|
      i==0 || yield(self[i-1])!=yield(o) ||
      !self[i+1] || yield(self[i+1])!=yield(o)
    end
  end
end

This works, but relying upon array indexing in Ruby is usually "code smell": an indication that there is a more elegant implementation available.

like image 612
Phrogz Avatar asked Oct 13 '25 06:10

Phrogz


1 Answers

Here's a general solution for simplifying any 2D line, with a custom threshold for how big an angle should be before removing it. Assumes that X and Y values are displayed using the same scale.

# Assumes that points is an array of two-valued arrays representing
# [x,y] pairs; removes points whose angle is less than the threshold
def simplify_line(points,inflection_threshold_degrees=1)
  points.reject.with_index do |p1,i|
    if i>0 && (p0=points[i-1]) && (p2=points[i+1])
      # http://stackoverflow.com/a/7505937/405017
      p0p1 = (p1[0]-p0[0])**2 + (p1[1]-p0[1])**2
      p2p1 = (p1[0]-p2[0])**2 + (p1[1]-p2[1])**2
      p0p2 = (p2[0]-p0[0])**2 + (p2[1]-p0[1])**2
      angle = Math.acos( (p2p1+p0p1-p0p2) / Math.sqrt(4*p2p1*p0p1) )*180/Math::PI
      (180 - angle).abs < inflection_threshold_degrees
    end
  end
end

Used with the points from the graph displayed in the question above:

pts = [ [  0,40],[ 20,80],[ 40,90],[ 60,80],[ 80,80],[100,60],[120,60],
        [140,60],[160,60],[180,60],[200,80],[220,80],[240,80],[260,60],
        [280,40],[300,33],[320,27],[340,20],[360,20] ]

and we get good results:

Line graphs with points removed between 10° and 60°, showing increasingly simplified versions that retain the end points, ultimately resulting in a single line.

like image 173
Phrogz Avatar answered Oct 15 '25 11:10

Phrogz