Bulk loading point quadtree

Question

I have implemented a method for bulk loading a point quadtree. But for some inputs it doesn't work correctly, for example if there are many points that have the same x- or y-coordinate. An example dataset would be:

test = [(3, 1), (16, 1), (11, 4), (5, 4), (9, 6), (5, 10),
        (1, 15), (11, 5), (11, 15), (12, 16), (19, 17)]
tree = create(test)

The problem occurs at the points : (11,4),(11,5),(11,15) and (5,10),(5,4).

This is the create function:

def create(point_list, presorted=False):
    if not point_list:
        return QuadNode()

    if not presorted:
        point_list.sort(key=lambda p: [p[0],p[1]])

    median = len(point_list) >> 1

    relevantPoint = point_list[median]
    relevantYCoordinate = relevantPoint[1]

    node = QuadNode(data=relevantPoint)

    leftBins = point_list[:median]
    rightBins = point_list[median + 1:]

    nwBins = [bin for bin in leftBins if bin[1] >= relevantYCoordinate]
    swBins = [bin for bin in leftBins if bin[1] < relevantYCoordinate]

    neBins = [bin for bin in rightBins if bin[1] >= relevantYCoordinate]
    seBins = [bin for bin in rightBins if bin[1] < relevantYCoordinate]

    node.nwNode = create(nwBins, presorted=True)
    node.swNode = create(swBins, presorted=True)

    node.neNode = create(neBins, presorted=True)
    node.seNode = create(seBins, presorted=True)
    return node

and the QuadNode:

class QuadNode(object):
    def __init__(self, data=None, nwNode=None, neNode=None, swNode=None, seNode=None):
        self.data = data
        self.nwNode = nwNode
        self.neNode = neNode
        self.swNode = swNode
        self.seNode = seNode

I want to follow the rule for insertion ,deletion, etc.:

swNode point.x < parent.x and point.y < parent.y
seNode point.x >= parent.x and point.y < parent.y
nwNode point.x < parent.x and point.y >= parent.y
neNode point.x >= parent.x and point.y >= parent.y

Pierre Rust · Accepted Answer

Your method of choosing the middle is the right one (as explained in Finkel's original article Quadtrees: A data structure for retrieval on composite keys), but the way you build the sub-collections for the subtrees is wrong.

For example with this sorted list:

[(1, 1), (1, 2), (1, 3)]

the median is 1, 2 and, given your rules for boundaries, 1,1 must be in SE and 1,3 in NE.
In the original article SE and NW are 'open' and NW and SE are closed: 1,1 is in NW and 1,3 in SE. As you can see with this definition of the boundaries all the elements before the median will be in SE or NW and all the elements after the median will be in SW or NE. But that does not hold with your definition of the boundaries.

So either there is something wrong with the definition of your boundaries or you must must check each element of the list to make sure it ends up in the right area. For eaxmple :

relevantPoint = point_list[median]
node = QuadNode(data=relevantPoint)
del point_list[median]

nwBins = [(x,y) for x,y  in point_list if x < relevantPoint[0] and y >= relevantPoint[1]]
swBins = [(x,y) for x,y  in point_list if x < relevantPoint[0] and y < relevantPoint[1]]
seBins = [(x,y) for x,y  in point_list if x >= relevantPoint[0] and y <= relevantPoint[1]]
neBins = [(x,y) for x,y  in point_list if x <= relevantPoint[0] and y > relevantPoint[1]]

However that's quite ugly and does not ensure the tree will be balanced. I'd rather check the definition of the boundaries ....

Bulk loading point quadtree

Tags:

python

data-structures

quadtree

greedsin

1 Answers

Pierre Rust

Recent Activity

Donate For Us

Bulk loading point quadtree

Tags:

python

data-structures

quadtree

greedsin

1 Answers

Pierre Rust

Related questions

Recent Activity

Donate For Us