Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Persistent (Disk Based) R-Tree (or R* Tree)

How can R* Tree be implemented as a persistent (disk based) one? What is the architecture of the file for saving the R* Tree index or for saving leaf values?

Notes: In addition how insert, update and delete operations can be performed in such a persistent R* Tree?

Notes II: I have implemented an in-memory R-Tree with bulk load functionality. But I think that is totally irrelevant when we speak about disk-based ones.

like image 255
Kaveh Shahbazian Avatar asked Nov 28 '12 07:11

Kaveh Shahbazian


1 Answers

Architecture of the file

Well, it's pages (= blocks). The pages should have a multiple of the page size of the underlying storage, so probably 1kb or 8kb blocks. Each block has a number and can be references this way.

Directory pages store the bounding boxes of the children and their page numbers.

Child pages store the actual data objects.

Managing the tree

Well, in theory: whenever you modify a page in memory, you write the changes to disk. That's it.

In practise, you might want to use a cache for performance, and you might want to have transactions for keeping your tree consistent in case of an application crash.

On both of these things you can find a lot of literature in the field of RDBMS architecture.

A key benefit of the R*-tree is that it is a regular page-oriented tree as you would have them in database systems all over the place. If you have a good on disk B+-tree implementation, you can reuse most of your code for the R*-tree.

How to get started

To get started, you need to get used to disk-based data indexing, as it is done in classic RDBMS. I'd suggest starting with an on disk B-tree or B+-tree. Do allow deletions, because you need to think about managing deleted pages and all this.

Once you figured out the B-Tree on disk (and maybe spend some time on optimizing it!), doing an on-disk R-tree should be fairly obvious.

I havn't looked at the code, but this might be a good starting point: http://www.die-schoens.de/prg/ or some of the others linked in Looking for a disk-based B+ tree implementation in C++ or C

like image 166
Has QUIT--Anony-Mousse Avatar answered Oct 05 '22 08:10

Has QUIT--Anony-Mousse