I have implemented a method for bulk loading a point quadtree. But for some inputs it doesn't work correctly, for example if there are many points that have the same x- or y-coordinate. An example dataset would be:
test = [(3, 1), (16, 1), (11, 4), (5, 4), (9, 6), (5, 10),
(1, 15), (11, 5), (11, 15), (12, 16), (19, 17)]
tree = create(test)
The problem occurs at the points : (11,4),(11,5),(11,15)
and (5,10),(5,4)
.
This is the create
function:
def create(point_list, presorted=False):
if not point_list:
return QuadNode()
if not presorted:
point_list.sort(key=lambda p: [p[0],p[1]])
median = len(point_list) >> 1
relevantPoint = point_list[median]
relevantYCoordinate = relevantPoint[1]
node = QuadNode(data=relevantPoint)
leftBins = point_list[:median]
rightBins = point_list[median + 1:]
nwBins = [bin for bin in leftBins if bin[1] >= relevantYCoordinate]
swBins = [bin for bin in leftBins if bin[1] < relevantYCoordinate]
neBins = [bin for bin in rightBins if bin[1] >= relevantYCoordinate]
seBins = [bin for bin in rightBins if bin[1] < relevantYCoordinate]
node.nwNode = create(nwBins, presorted=True)
node.swNode = create(swBins, presorted=True)
node.neNode = create(neBins, presorted=True)
node.seNode = create(seBins, presorted=True)
return node
and the QuadNode
:
class QuadNode(object):
def __init__(self, data=None, nwNode=None, neNode=None, swNode=None, seNode=None):
self.data = data
self.nwNode = nwNode
self.neNode = neNode
self.swNode = swNode
self.seNode = seNode
I want to follow the rule for insertion ,deletion, etc.:
swNode
point.x < parent.x
and point.y < parent.y
seNode
point.x >= parent.x
and point.y < parent.y
nwNode
point.x < parent.x
and point.y >= parent.y
neNode
point.x >= parent.x
and point.y >= parent.y
Your method of choosing the middle is the right one (as explained in Finkel's original article Quadtrees: A data structure for retrieval on composite keys), but the way you build the sub-collections for the subtrees is wrong.
For example with this sorted list:
[(1, 1), (1, 2), (1, 3)]
the median is 1, 2
and, given your rules for boundaries, 1,1
must be in SE and 1,3
in NE.
In the original article SE and NW are 'open' and NW and SE are closed: 1,1
is in NW and 1,3
in SE. As you can see with this definition of the boundaries all the elements before the median will be in SE or NW and all the elements after the median will be in SW or NE. But that does not hold with your definition of the boundaries.
So either there is something wrong with the definition of your boundaries or you must must check each element of the list to make sure it ends up in the right area. For eaxmple :
relevantPoint = point_list[median]
node = QuadNode(data=relevantPoint)
del point_list[median]
nwBins = [(x,y) for x,y in point_list if x < relevantPoint[0] and y >= relevantPoint[1]]
swBins = [(x,y) for x,y in point_list if x < relevantPoint[0] and y < relevantPoint[1]]
seBins = [(x,y) for x,y in point_list if x >= relevantPoint[0] and y <= relevantPoint[1]]
neBins = [(x,y) for x,y in point_list if x <= relevantPoint[0] and y > relevantPoint[1]]
However that's quite ugly and does not ensure the tree will be balanced. I'd rather check the definition of the boundaries ....
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With