Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Quad tree and Kd tree

I have a set of latitude and longitude for various locations and also know the latitude and longitude of my current location. I have to find out the nearest places from current location.

  • Which algorithm is best one from Kdtree and quadtree to find out the neighbour locations from the set of latitudes and longitudes?
  • What are the advantage of one over other?
  • How can we implement these to algorithms in C# for above purpose?
like image 628
Kesiya Abraham Avatar asked Mar 18 '17 11:03

Kesiya Abraham


People also ask

Is a kd-tree better than a quadtree?

I'd argue that in this case a kd-tree would do better than a quadtree, since when using a quadtree, for finding a nearest neighbor, the closest object might be placed right on the other side of a division between nodes.

What are quadtrees?

Quadtrees are trees used to efficiently store data of points on a two-dimensional space. In this tree, each node has at most four children. Divide the current two dimensional space into four boxes. If a box contains one or more points in it, create a child object, storing in it the two dimensional space of the box

What is quadtree compression?

Quadtree compression of an image step by step. A quadtree is a tree data structure in which each internal node has exactly four children. Quadtrees are the two-dimensional analog of octrees and are most often used to partition a two-dimensional space by recursively subdividing it into four quadrants or regions.

What is a four quadrant tree used for?

Quadtrees are the two-dimensional analog of octrees and are most often used to partition a two-dimensional space by recursively subdividing it into four quadrants or regions. The data associated with a leaf cell varies by application, but the leaf cell represents a "unit of interesting spatial information".


2 Answers

Comparing spatial indexing techniques I would like to bring a 3rd one into our comparative study which is called Grid indexing. and in order to understand Quad-Tree I would like to go into Grid indexing first.

What is Grid indexing?

Grid indexing is a grid-base spatial indexing method in which the study area is divided into fixed size tiles (fixed dimensions) like chess board.

Using Grid index, every point in a tile are tagged with that tile number, so the Index table could provide you a tag for each point showing the tile which our number falls in.

RTree Tiles

Imagine a situation in which you need to find points in a given rectangle. This query is performed in two steps :

  1. Find the tiles that the rectangle is overlapping, and the points in tiles (first filter)
  2. Finding the candidate points in above step that actually lay in our rectangle. this need to be done accurately using points and rectangle coordinates.(second filter)

The first filter creates a set of candidates and prevents to test all points in our study area to be checked one after the other.

The second filter is the accurate check and uses rectangle coordinates to test the candidates.

R Tree, Rectangle Query

Now, Take a look at the tiles in above pictures, what happens if the tiles are very big or very Small?

When the tiles are too big, for example assume that you have a tile with equal size of your study area, which makes only one tile! so the first filter is practically useless, the whole processing load will be burden by the second filter. In this case the first filter is fast and the second filter is very slow.

Now imagine that the tiles are very small, in this case the first filter is very slow and practically it generates the answer it self, and the second filter is fast.

Determining the tile size is very important and affects the performance directly but what if you can not determine the best tile dimension? what if you you area has both spare and dense sub-areas? Here it is the time to use other spatial indexing mechanisms like R-Tree, KD-Tree or Quad-Tree!

What is Quad-Tree?

Quad Tree method starts with a big tile that covers whole study area,and divides it by two horizontal and vertical lines to have four equal area which are new tiles and then inspect each tile to see if it has more than a pre-defined threshold, points in it. in this case the tile will be divided into four equal parts using a horizontal and a vertical dividing lines,again. The process continues till there would be no more tile with number of points bigger than threshold, which is a recursive algorithm.

So in denser areas we have smaller tiles and big tiles when having spare points.

Quad-Tree

What is KD-Tree? In KD-Tree, we divide an area if it has more than a threshold points in it (other criterias can be used) dividing is done using a (K-1) dimension geometry, for example in a 3D-Tree we need a plane to divide the space, and in a 2D-Tree we need a line to divide the area. Dividing geometry is iterative and cyclic, for example in a 3D-Tree, the first splitting plane is a X axis aligned plane and the next dividing plane is Y axis aligned and the next is Z, the cycle continue for each space parts to become acceptable(satisfy the criteria)

The following picture shows a balanced KD-Tree that each dividing line is a median, that divides an area into two sub-area with approximately equal number of points.

2D-Tree

Conclusion : If you have a well-distributed points which is not the case when talking about structural features of earth in a map, cause they are random, but is acceptable when we plan to store a city roads network. I would go for a Grid indexing.

If you have a limited resource(i.e. Car navigation Systems) you need to implement KD-Tree or Quad-Tree. Each has its own pros and cons.

  1. Quad-Tree creates a lot of empty sub-tiles because each tile has to be divided into four parts even if the whole data of our tile can be fit in one quarter, so the rest of sub-tiles are considered redundant.(take a look at the Quad-Tree picture at above)
  2. Quad-Tree has much more easier index and can be implemented more easier. Accessing a tile with a Tile ID does not need a recursive function.
  3. In a 2 dimensional Kd-Tree, each node has just two children nodes or has no child at all, so searching through the KD-Tree is a binary search in nature.
  4. Updating Quad-Tree is much more easier than updating a balanced KD-Tree.

With the above descriptions I recommend to start with Quad-Tree

Here it is a sample code for quad-tree that intend to create 5000 random point.

#include<stdio.h>
#include<stdlib.h>
//Removed windows-specific header and functions

//-------------------------------------
// STRUCTURES
//-------------------------------------
struct Point
{
    int x;
    int y;
};


struct Node
{
    int posX;
    int posY;
    int width;
    int height;
    Node *child[4];         //Changed to Node *child[4] rather than Node ** child[4]
    Point pointArray[5000];
};
//-------------------------------------
// DEFINITIONS
//-------------------------------------

void BuildQuadTree(Node *n);
void PrintQuadTree(Node *n, int depth = 0);
void DeleteQuadTree(Node *n);
Node *BuildNode(Node *n, Node  *nParent, int index);

//-------------------------------------
// FUNCTIONS
//-------------------------------------

void setnode(Node *xy,int x, int y, int w, int h)
{
    int i;
    xy->posX = x;
    xy->posY = y;
    xy->width= w;
    xy->height= h;
    for(i=0;i<5000;i++)
    {
        xy->pointArray[i].x=560;
        xy->pointArray[i].y=560;
    }
    //Initialises child-nodes to NULL - better safe than sorry
    for (int i = 0; i < 4; i++)
        xy->child[i] = NULL;
}
int randn()
{
    int a;
    a=rand()%501;
    return a;
}

int pointArray_size(Node *n)
{
    int m = 0,i;
    for (i = 0;i<=5000; i++)
        if(n->pointArray[i].x <= 500 && n->pointArray[i].y <= 500)
            m++;
    return (m + 1);
}
//-------------------------------------
// MAIN
//-------------------------------------
int main()
{
    // Initialize the root node
    Node * rootNode = new Node;     //Initialised node
    int i, x[5000],y[5000];
    FILE *fp;
    setnode(rootNode,0, 0, 500, 500);


    // WRITE THE RANDOM POINT FILE  
    fp = fopen("POINT.C","w");

    if ( fp == NULL )
    {
        puts ( "Cannot open file" );
        exit(1);
    }
    for(i=0;i<5000;i++)
    {
        x[i]=randn();
        y[i]=randn();
        fprintf(fp,"%d,%d\n",x[i],y[i]);
    }
    fclose(fp);

    // READ THE RANDOM POINT FILE AND ASSIGN TO ROOT Node
    fp=fopen("POINT.C","r");
    for(i=0;i<5000;i++)
    {
        if(fscanf(fp,"%d,%d",&x[i],&y[i]) != EOF)
        {
            rootNode->pointArray[i].x=x[i];
            rootNode->pointArray[i].y=y[i];
        }
    }

    fclose(fp);

    // Create the quadTree
    BuildQuadTree(rootNode);
    PrintQuadTree(rootNode);    //Added function to print for easier debugging
    DeleteQuadTree(rootNode);

    return 0;
}

//-------------------------------------
// BUILD QUAD TREE
//-------------------------------------
void BuildQuadTree(Node *n)
{
    Node * nodeIn = new Node;   //Initialised node

    int points = pointArray_size(n);

    if(points > 100)
    {
        for(int k =0; k < 4; k++)
        {
            n->child[k] = new Node;     //Initialised node
            nodeIn = BuildNode(n->child[k], n, k);
            BuildQuadTree(nodeIn);
        }
    }
}
//-------------------------------------
// PRINT QUAD TREE
//-------------------------------------
void PrintQuadTree(Node *n, int depth)
{
    for (int i = 0; i < depth; i++)
        printf("\t");

    if (n->child[0] == NULL)
    {
        int points = pointArray_size(n);
        printf("Points: %d\n", points);
        return;
    }
    else if (n->child[0] != NULL)
    {
        printf("Children:\n");
        for (int i = 0; i < 4; i++)
            PrintQuadTree(n->child[i], depth + 1);
        return;
    }
}
//-------------------------------------
// DELETE QUAD TREE
//-------------------------------------
void DeleteQuadTree(Node *n)
{
    if (n->child[0] == NULL)
    {
        delete n;
        return;
    }
    else if (n->child[0] != NULL)
    {
        for (int i = 0; i < 4; i++)
            DeleteQuadTree(n->child[i]);
        return;
    }
}
//-------------------------------------
// BUILD NODE
//-------------------------------------
Node *BuildNode(Node *n, Node *nParent, int index)
{
    int numParentPoints, i,j = 0;

    // 1) Creates the bounding box for the node
    // 2) Determines which points lie within the box

    /*
     Position of the child node, based on index (0-3), is determined in this order:
     | 1 | 0 |
     | 2 | 3 |
     */

    setnode(n, 0, 0, 0, 0);

    switch(index)
    {
        case 0: // NE

            n->posX = nParent->posX+nParent->width/2;
            n->posY = nParent->posY+nParent->height/2;
            break;

        case 1: // NW

            n->posX = nParent->posX;
            n->posY = nParent->posY+nParent->height/2;
            break;

        case 2: // SW

            n->posX = nParent->posX;
            n->posY = nParent->posY;
            break;

        case 3: // SE

            n->posX = nParent->posX+nParent->width/2;
            n->posY = nParent->posY;
            break;

    }

    // Width and height of the child node is simply 1/2 of the parent node's width and height
    n->width = nParent->width/2;
    n->height = nParent->height/2;

    // The points within the child node are also based on the index, similiarily to the position
    numParentPoints = pointArray_size(nParent);

    switch(index)
    {
        case 0: // NE
            for(i = 0; i < numParentPoints-1; i++)
            {
                // Check all parent points and determine if it is in the top right quadrant
                if(nParent->pointArray[i].x<=500 && nParent->pointArray[i].x > nParent->posX+nParent->width/2 && nParent->pointArray[i].y > nParent->posY + nParent->height/2 && nParent->pointArray[i].x <= nParent->posX + nParent->width && nParent->pointArray[i].y <= nParent->posY + nParent-> height)
                {
                    // Add the point to the child node's point array
                    n->pointArray[j].x = nParent ->pointArray[i].x;
                    n->pointArray[j].y = nParent ->pointArray[i].y;
                    j++;
                }
            }
            break;
        case 1: // NW
            for(i = 0; i < numParentPoints-1; i++)
            {
                // Check all parent points and determine if it is in the top left quadrant
                if(nParent->pointArray[i].x<=500 && nParent->pointArray[i].x > nParent->posX && nParent->pointArray[i].y > nParent->posY+ nParent-> height/2 && nParent->pointArray[i].x <= nParent->posX + nParent->width/2 && nParent->pointArray[i].y <= nParent->posY + nParent->height)
                {
                    // Add the point to the child node's point array
                    n->pointArray[j].x = nParent ->pointArray[i].x;
                    n->pointArray[j].y = nParent ->pointArray[i].y;
                    j++;
                }
            } 
            break;
        case 2: // SW
            for(i = 0; i < numParentPoints-1; i++)
            {
                // Check all parent points and determine if it is in the bottom left quadrant
                if(nParent->pointArray[i].x<=500 && nParent->pointArray[i].x > nParent->posX && nParent->pointArray[i].y > nParent->posY && nParent->pointArray[i].x <= nParent->posX + nParent->width/2 && nParent->pointArray[i].y <= nParent->posY + nParent->height/2)
                {   
                    // Add the point to the child node's point array
                    n->pointArray[j].x = nParent ->pointArray[i].x;
                    n->pointArray[j].y = nParent ->pointArray[i].y;
                    j++;
                }
            }
            break;

        case 3: // SE
            for(i = 0; i < numParentPoints-1; i++)
            {
                // Check all parent points and determine if it is in the bottom right quadrant
                if(nParent->pointArray[i].x<=500 && nParent->pointArray[i].x > nParent->posX +  nParent->width/2 && nParent->pointArray[i].y > nParent->posY && nParent->pointArray[i].x <= nParent->posX + nParent->width && nParent->pointArray[i].y <= nParent->posY + nParent->height/2)
                {
                    // Add the point to the child node's point array
                    n->pointArray[j].x = nParent ->pointArray[i].x;
                    n->pointArray[j].y = nParent ->pointArray[i].y;
                    j++;
                }
            }
            break;

    }
    return n;
}
like image 191
Iman Avatar answered Sep 19 '22 22:09

Iman


I'd argue that in this case a kd-tree would do better than a quadtree, since when using a quadtree, for finding a nearest neighbor, the closest object might be placed right on the other side of a division between nodes. Kd-trees on the other hand, allows implementation of very efficient nearest-neighbor search, although insertion and removal will be more difficult, while maintaining a balanced tree.

like image 42
Nikolaj Dam Larsen Avatar answered Sep 20 '22 22:09

Nikolaj Dam Larsen