I know implementing a database is a huge topic, but I want to gain a basic understanding of how database systems work (e.g. memory management, binary trees, transactions, SQL parsing, multi-threading, partitioning, etc.) by investigating the source code of a database.
There are a few proven, very robust open-source databases, such as MySQL and SQLite. However, their code is very complicated and I have no clue where to start. I also find that old-school database textbooks explain only the theory, not the implementation details.
Can anyone suggest how I should get started, and whether there are any books that emphasize the technology and techniques used to build DBMSs in the modern database industry?
MySQL runs as a database server on a network, and clients connect to that server to issue queries. SQLite, by contrast, is what is known as an embedded database: the engine is a library that runs inside the application itself, and the database is stored in a single ordinary file.
SQLite's source code is in the public domain, meaning that you can make as many copies of it as you want and do whatever you want with those copies, without limitation.
You should learn SQL and SQLite together, since to learn SQL you'll need an SQL engine, and SQLite is just that. Note that SQLite doesn't implement all of the SQL language, but it's a great place to start learning it because of the library's simplicity.
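To make the "SQLite is just an SQL engine" point concrete, here is a minimal sketch using Python's standard-library `sqlite3` module, which bundles SQLite. The table name `books`, its columns, and the sample rows are made up purely for illustration; nothing here comes from the original posts.

```python
import sqlite3


def run_demo():
    """Create a throwaway database, insert a few rows, and query them back."""
    # ":memory:" asks SQLite for a private in-memory database.
    # Passing a file path instead would store the database in that file.
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema, used only for this example.
        conn.execute(
            "CREATE TABLE books ("
            "id INTEGER PRIMARY KEY, title TEXT NOT NULL, year INTEGER)"
        )
        # Placeholder sample data; "?" parameters avoid building SQL by hand.
        conn.executemany(
            "INSERT INTO books (title, year) VALUES (?, ?)",
            [("Title A", 1995), ("Title B", 2005), ("Title C", 2015)],
        )
        conn.commit()
        # A plain SQL query, executed in-process with no server involved.
        cursor = conn.execute(
            "SELECT title, year FROM books WHERE year >= ? ORDER BY year",
            (2000,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for title, year in run_demo():
        print(title, year)
```

Everything runs inside the calling process: there is no server to install or connect to, which is what makes SQLite convenient for practising SQL.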
MySQL has a well-constructed user management system that can handle multiple users and grant them different levels of permission. SQLite has no user accounts of its own and is best suited to smaller databases. As the database grows, SQLite's memory requirement tends to grow with it, and performance optimization is harder than with MySQL.
I hate to sound like a grumpy old academic, but the theory really is what you need to study if you are determined to build your own RDBMS. The implementation details are really just, erh, implementation details. Apart from textbooks, you might also want to study research papers, which tend to cover the subject in greater detail.
When you start implementing your database engine, you could look into existing open-source implementations, but do expect the learning curve to be steep. As you have already discovered, these projects tend to be quite complex. When you have concrete questions about those projects, try posting them on the relevant mailing lists. When you have concrete questions about your own implementation, post them here :)