Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Array algorithm too slow

i have written an algorithm that finds each line of a hexagon in a huge object-array structure.

the array consits of about 80.000 - 100.000 elements (line coordinates from start to end).

A hexagon cosists of 6 lines points. So the array has the information of about 15.000 hexagons.

The structure of the object (UNSORTED!!!) looks like this:

const stamps = [
  { 
    vertices: [
      {x: 114.5116411118, y: 342.9815785601},
      {x: 115.6663416502, y: 344.9815785601}
    ]
  },
  {
    vertices: [
      {x: 115.6663416502, y: 340.9815785601},
      {x: 114.5116411118, y: 342.9815785601}
    ]
  },
  {
    vertices: [
      {x: 122.6663416502, y: 364.9815785601},
      {x: 147.9757427269, y: 314.9815785601},
    ]
  },
  {
    vertices: [
      {x: 117.9757427269, y: 340.9815785601},
      {x: 115.6663416502, y: 340.9815785601},
    ]
  },
  {
    vertices: [
      {x: 119.1304432653, y: 342.9815785601},
      {x: 117.9757427269, y: 340.9815785601},
    ]
  },
  {
    vertices: [
      {x: 117.9757427269, y: 344.9815785601},
      {x: 119.1304432653, y: 342.9815785601},
    ]
  },
  {
    vertices: [
      {x: 115.6663416502, y: 344.9815785601},
      {x: 117.9757427269, y: 344.9815785601},
    ]
  },
];

To find each line hexagon, my idea was that there has to be 2 elements that have to same coordinate. If this is the case, i'm jumping to the index of this element and repeat that process untill i have all 6 lines of the hexagon.

It works like this, but its really, really slow. For an array with 80.000 elements its about 3 minutes.

The algorithm:

function findHexPolyPoints() {
  const hexCoordinates = [];
  let activeArrayPos = 0;
  let position = 0;
  while (1) {
    let foundPair = false;
    if (stamps.length < 6) break;
    for (let k = 0; k < stamps.length; ++k) {
      if (k === position) continue;
      if (stamps[position].vertices[0].x === stamps[k].vertices[1].x && stamps[position].vertices[0].y === stamps[k].vertices[1].y) {
        if (hexCoordinates[activeArrayPos]) {
          hexCoordinates[activeArrayPos].push(stamps[k].vertices[0].x, stamps[k].vertices[0].y);
        } else {
          hexCoordinates.push([stamps[position].vertices[0].x, stamps[position].vertices[0].y, stamps[k].vertices[0].x, stamps[k].vertices[0].y]);
        }
        foundPair = true;
      } else if (stamps[position].vertices[1].x === stamps[k].vertices[0].x && stamps[position].vertices[1].y === stamps[k].vertices[0].y) {
        if (hexCoordinates[activeArrayPos]) {
          hexCoordinates[activeArrayPos].push(stamps[k].vertices[1].x, stamps[k].vertices[1].y);
        } else {
          hexCoordinates.push([stamps[position].vertices[1].x, stamps[position].vertices[1].y, stamps[k].vertices[1].x, stamps[k].vertices[1].y]);
        }
        foundPair = true;
      }
      if (foundPair) {
        stamps.splice(position, 1);
        if (k > position) {
          position = k - 1;
        } else {
          position = k;
        }
        if (hexCoordinates[activeArrayPos].length < 12) break;
      }
      if (hexCoordinates[activeArrayPos] && hexCoordinates[activeArrayPos].length === 12) {
        if (k > position) stamps.splice(k - 1, 1);
        else stamps.splice(k, 1);
        activeArrayPos += 1;
        position = 0;
        break;
      }
      if (k === stamps.length - 1) {
        stamps.splice(position, 1);
        break;
      }
    }
  }
  sortHexagons(hexCoordinates);
}

Is there any way to speed up my algorithm? I have read that a simple for loop is still faster that some js sort functions like .map .filter or similar.

like image 537
David Avatar asked Mar 03 '23 03:03

David


2 Answers

The following O(n) algorithm assumes

  1. two different hexagons do not have overlapping vertices
  2. there are no more than two same vertices in the array (meaning, there could be an orphan that doesn't belong to any hexagon, but its coordinates should not equal any of the hexagons vertices)
  3. there is no floating point inaccuracy in coordinates (meaning two vertices that are supposed to be equal, are === exactly equal)
  4. 6 connected vertices are assumed to form an hexagon... (there is no barycenter calculation and checks to ensure it's actually an hexagon)

(if the points 1. and 2. are not true, the algo would need to be worked more, to try all possibilities (in overt[x_y] array, see below) in order to avoid non-hexagon or overlapping coords, depending on the expectancy to find overlapping hexagons, or orphans, the complexity could go beyond O(n))

Using the concept of map (get an object value from a key), which is considered O(1).

In order to conveniently use the vertices coordinates, we can concatenate x and y into one string

x:123.456, y:789.321

gives

x_y = "123.456_789.321"

Let's create 3 variables avert = [], overt = {}, hexas = []

  • avert is an array of all vertices, avert[index] is an array(2) of the x_y vertices coordinates
  • overt is an object that, for each x_y coordinates gives an array of indexes in avert (the size should not become more than 2, as assumed above (and there is no >2 check))
  • hexas is an array of array(6), the list of hexagons found (each coordinate is formatted x_y)

In the first forEach, avert and overt are created.

The next forEach processes all avert vertices [x_y1, x_y2]

  • starting from the first vertex, tries to find the 6 points to form an hexagon
  • adds each vertex to one hexagon array hexa, starting from the next (after the first)
  • assume coordinates are not sorted, thus ensure we're not going back to previous vertice
  • skips the used vertices (in hexagons)
  • ensure the last vertex found has the same coordinates as the origin (first)

Initialization

let avert = [], overt = {}, hexas = [];

stamps.forEach(function(e, i){
    let xy1 = e['vertices'][0]['x']+'_'+e['vertices'][0]['y'];
    let xy2 = e['vertices'][1]['x']+'_'+e['vertices'][1]['y'];
    // overt[XY] (array) should have two elements at most (no overlapping),
    // one if orphan
    if ( ! overt[xy1]) overt[xy1] = [];
    overt[xy1].push( i );
    if ( ! overt[xy2]) overt[xy2] = [];
    overt[xy2].push( i );
    avert.push([ xy1, xy2 ]);
});

Processing

avert.forEach(function (e){
    let j,coord = e[0];   // first coords x_y
    let origin = coord;
    let hexa = [];
    let lastindex = -1;   // avoid going back!
    // try to find 5 connected vertices + origin
    for(j=0 ; j<6 ; j++) {
        let o = overt[coord];
        if ( o === undefined || o.length < 2 ) {
           break; // not found(already processed), or orphan!
        }
        let index = o[0] == lastindex ? o[1] : o[0];  // no turn back!
        lastindex = index;
        coord = avert[index][0] === coord ? avert[index][1] : avert[index][0];
        hexa.push(coord);
    }
    if (j >= 6) { // found all vertices
        // check that the last coord is the starting point
        if (hexa[5] === origin) { // got it
             hexas.push( hexa );
             hexa.forEach(function(h){ // delete used vertices
                delete overt[h];
             });
        }
    }
});

All hexagons should be in hexas

like image 91
Déjà vu Avatar answered Mar 05 '23 17:03

Déjà vu


You can avoid the nested loop over all data by using a hash map. Key the individual vertices with a hash, for instance their JSON representation, and store the corresponding x, y coordinate together with a list of neigbor objects.

Once you have that, it is easy to walk through that graph and identify the hexagons.

Runnable snippet using the sample data you provided:

function findHexPolyPoints(stamps) {
    // Create graph
    let map = new Map;
    for (let {vertices} of stamps) {
        // Get unique identifier for each vertex (its JSON notation)
        let keys = vertices.map(JSON.stringify);
        // Create "nodes" for each vertex, keyed by their key
        let nodes = keys.map(function (key, i) {
            let {x, y} = vertices[i];
            let node = map.get(key);
            if (!node) map.set(key, node = { key, x, y, neighbors: [] });
            return node;
        });
        // Link these two nodes in both directions
        nodes[0].neighbors.push(nodes[1]);
        nodes[1].neighbors.push(nodes[0]);
    }
    
    // Walk through the graph to detect and collect hexagons
    let hexagons = [];
    for (let [key, vertex] of map) {
        let hexagon = [];
        while (vertex && hexagon.push(vertex) < 6) {
            vertex = vertex.neighbors.find(neighbor => !hexagon.includes(neighbor));
        }
        // Remove collected nodes so they don't get visited a second time
        for (let {key} of hexagon) map.delete(key);
        // Verify that they form indeed a hexagon:
        if (vertex && vertex.neighbors.includes(hexagon[0])) {
            // Simplify the hexagon to only coordinates (12 coordinates)
            hexagons.push(hexagon.flatMap(({x, y}) => [x, y]));
        }
    }
    return hexagons;
}

// Demo. Just replace `stamps` with your actual data.
const stamps = [{vertices: [{x: 114.5116411118, y: 342.9815785601},{x: 115.6663416502, y: 344.9815785601}]},{vertices: [{x: 115.6663416502, y: 340.9815785601},{x: 114.5116411118, y: 342.9815785601}]},{vertices: [{x: 122.6663416502, y: 364.9815785601},{x: 147.9757427269, y: 314.9815785601},]},{vertices: [{x: 117.9757427269, y: 340.9815785601},{x: 115.6663416502, y: 340.9815785601},]},{vertices: [{x: 119.1304432653, y: 342.9815785601},{x: 117.9757427269, y: 340.9815785601},]},{vertices: [{x: 117.9757427269, y: 344.9815785601},{x: 119.1304432653, y: 342.9815785601},]},{vertices: [{x: 115.6663416502, y: 344.9815785601},{x: 117.9757427269, y: 344.9815785601},]},];
let hexagons = findHexPolyPoints(stamps);
console.log(hexagons);

It is true that plain old for loops are somewhat faster than .map, .forEach, .reduce, .find and the likes, but here I kept using them, as the main speed up is really coming from using a hash map.

like image 41
trincot Avatar answered Mar 05 '23 17:03

trincot