Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does MySQL store rows on disk? [duplicate]

I looked around Google but didn't find any good answers. Does it store the data in one big file? What methods does it use to make data access quicker than just reading and writing to a regular file?

like image 569
Tallboy Avatar asked Apr 30 '12 04:04

Tallboy


4 Answers

This question is a bit old but I decided to answer it anyway since I have been doing some digging on the same. My answer is based on the linux file system. Basically mySQL stores data in files in your hard disk. It stores the files in a specific directory that has the system variable "datadir". Opening a mysql console and running the following command will tell you exactly where the folder is located.

mysql>  SHOW VARIABLES LIKE 'datadir';
+---------------+-----------------+
| Variable_name | Value           |
+---------------+-----------------+
| datadir       | /var/lib/mysql/ |
+---------------+-----------------+
1 row in set (0.01 sec)

As you can see from the above command, my "datadir" was located in /var/lib/mysql/. The location of the "datadir" may vary in different systems. The directory contains folders and some configuration files. Each folder represents a mysql database and contains files with data for that specific database. below is a screenshot of the "datadir" directory in my system.

sample "datadir" directory screenshot

Each folder in the directory represents a MySQL database. Each database folder contains files that represent the tables in that database. There are two files for each table, one with a .frm extension and the other with a .idb extension. See screenshot below.

sample MySQL database table screenshot

The .frm table file stores the table's format. Details: MySQL .frm File Format

The .ibd file stores the table's data. Details: InnoDB File-Per-Table Tablespaces

That’s it folks! I hope I helped someone.

like image 97
Benjamin Gakami Avatar answered Oct 28 '22 08:10

Benjamin Gakami


Does it store the data in one big file?

Some DBMSes store the whole database in a single file, some split tables, indexes and other object kinds to separate files, some split files not by object kind but by some storage/size criteria, some can even entirely bypass the file system, etc etc...

I don't know which one of these strategies MySQL uses (it probably depends on whether you use MyISAM vs. InnoDB etc.), but fortunately, it doesn't matter: from the client perspective, this is a DBMS implementation detail the client should rarely worry about.

What methods does it use to make data access quicker them just reading and writing to a regular file?

First of all, DBMses are not just about performance:

  • They are even more about safety of your data - they have to ensure there is no data corruption even in the face of a power cut or a network failure.1
  • DBMSes are also about concurrency - they have to arbiter between multiple clients accessing and potentially modifying the same data.2

As for your specific question of performance, relational data is very "susceptible" to indexing and clustering, which is richly exploited by DBMSes to achieve performance. On top of that, the set-based nature of SQL lets the DBMS choose the optimal way to retrieve the data (in theory at least, some DBMSes are better at that than the others). For more about DBMS performance, I warmly recommend: Use The Index, Luke!

Also, you probably noticed that most DBMSes are rather old products. Like decades old, which is really eons in our industry's terms. One consequence of that is that people had plenty of time to optimize the heck out of the DBMS code base.

You could, in theory, achieve all these things through files, but I suspect you'd end-up with something that looks awfully close to a DBMS (even if you had the time and resources to actually do it). So, why reinvent the wheel (unless you didn't want the wheel in the first place ;) )?


1 Usually though some kind of "journaling" or "transaction log" mechanism. Furthermore, to minimize the probability of "logical" corruption (due to application bugs) and promote code reuse, most DBMSes support declarative constraints (domain, key and referential), triggers and stored procedures.

2 By isolating transactions and even by allowing clients to explicitly lock specific portions of the database.

like image 25
Branko Dimitrijevic Avatar answered Oct 28 '22 07:10

Branko Dimitrijevic


Technically everything is a "file" including folders.. your entire hard drive is giant file. Having said that, yes relational databases, MySQL included store data in a Data file on the hard drive. The difference between a Database and writing/reading to a file is apples and oranges. Databases provide a structured way to store and search/retrieve data in a way you could never replicate by just reading and writing to a file.. Unless you wrote your own db of course..

hope that helps.

like image 25
apesa Avatar answered Oct 28 '22 07:10

apesa


When you store data in a flat file, it is compact and efficient to read sequentially, but there is no fast way to access it randomly. This is especially true of variable-length data such as documents, names or strings. To allow for fast random access, most databases store information in a single file using a data structure called a B-Tree. This structure allows for insert, deletion, and search to be fast, but it can use up to 50% more space than the original file. Typically, however, this is not an issue as disk space is cheap and larger, while the primary tasks usually require fast access. For more information: http://en.wikipedia.org/wiki/B-tree

Looking carefully into the MySQL docs, we find that indices may be optionally set to "BTREE" or "HASH" type. Inside a single MySQL file, multiple indices are stored which may use either data structure.

Although safety and concurrency are important, these are not WHY databases exist, but added features. The very first databases exist because it is not possible to randomly access a sequential file containing variable length data.

like image 45
Rama Hoetzlein Avatar answered Oct 28 '22 08:10

Rama Hoetzlein