I read /proc/<pid>/io to measure the I/O activity of SQL queries, where <pid> is the PID of the database server. I read the values before and after each query and compute the difference, which gives the number of bytes the request caused to be read and/or written.
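For reference, here is a minimal sketch of how such a snapshot-and-diff measurement can be scripted (a Python sketch; read_io_counters and measure_query are illustrative names, and run_query is a hypothetical placeholder for however the SQL statement is actually issued):

    def read_io_counters(pid):
        """Parse /proc/<pid>/io into a dict mapping counter name to a byte count."""
        with open(f"/proc/{pid}/io") as f:
            return {name.strip(): int(value)
                    for name, value in (line.split(":") for line in f)}

    def measure_query(pid, query_file):
        """Snapshot the server's counters around one query; return deltas in bytes."""
        before = read_io_counters(pid)
        run_query(query_file)  # hypothetical placeholder for issuing the query
        after = read_io_counters(pid)
        return {name: after[name] - before[name] for name in before}

The deltas are in bytes; dividing by 1048576 gives the MB values shown below.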
As far as I know, the field read_bytes counts actual disk I/O, while rchar includes more, such as reads that could be satisfied by the Linux page cache (see Understanding the counters in /proc/[pid]/io for clarification). This leads to the assumption that rchar should come up with a value equal to or greater than read_bytes, but my results contradict this assumption.
I could imagine some minor block or page overhead in the results I get for Infobright ICE (values are MB):

Query        | rchar    | read_bytes
tpch_q01.sql | 34.44180 | 34.89453
tpch_q02.sql |  2.89191 |  3.64453
tpch_q03.sql | 32.58994 | 33.19531
tpch_q04.sql | 17.78325 | 18.27344
But I completely fail to understand the I/O counters for MonetDB (values are MB):

Query        | rchar   | read_bytes
tpch_q01.sql | 0.07501 | 220.58203
tpch_q02.sql | 1.37840 |  18.16016
tpch_q03.sql | 0.08272 | 162.38281
tpch_q04.sql | 0.06604 |  83.25391
Am I wrong in assuming that rchar includes read_bytes? Is there a way to trick the kernel's counters that MonetDB could be using? What is going on here?
I might add that I clear the page cache and restart the database server before each query. I'm on Ubuntu 11.10, running kernel 3.0.0-15-generic.
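For completeness, the cache-clearing step can be scripted like this (a sketch assuming root privileges and Python 3.3+ for os.sync(); writing "3" to /proc/sys/vm/drop_caches is the standard mechanism for dropping clean caches):

    import os

    def drop_page_cache():
        """Flush dirty pages to disk, then drop the page cache, dentries and
        inodes. Requires root; same effect as `sync; echo 3 > /proc/sys/vm/drop_caches`."""
        os.sync()  # flush dirty pages first so they can be dropped
        with open("/proc/sys/vm/drop_caches", "w") as f:
            f.write("3\n")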
I can only think of two things:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/filesystems/proc.txt;hb=HEAD#l1305
1:
1446 read_bytes
1447 ----------
1448
1449 I/O counter: bytes read
1450 Attempt to count the number of bytes which this process really did cause to
1451 be fetched from the storage layer.
I read "Caused to be fetched from the storage layer" to include readahead, whatever.
2:
1411 rchar
1412 -----
1413
1414 I/O counter: chars read
1415 The number of bytes which this task has caused to be read from storage. This
1416 is simply the sum of bytes which this process passed to read() and pread().
1417 It includes things like tty IO and it is unaffected by whether or not actual
1418 physical disk IO was required (the read might have been satisfied from
1419 pagecache)
Note that this says nothing about "disk access via memory mapped files". I think this is the more likely reason: MonetDB probably mmaps its database files and then does everything on them. Reads through a memory mapping never pass through read()/pread(), so they don't count towards rchar, but the pages still have to be fetched from the storage layer, so they do count towards read_bytes.
I'm not really sure how you could check the bandwidth used through mmap, because of its nature.
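One way to at least see the asymmetry is a small experiment: map a file, fault in its pages, and watch which counter moves (a sketch; the file path is a placeholder for a large file that is not already in the page cache, and reading /proc/self/io itself adds a few hundred bytes of noise to rchar):

    import mmap

    def io_counters():
        """Parse /proc/self/io into a dict of counter name -> byte count."""
        with open("/proc/self/io") as f:
            return {name.strip(): int(value)
                    for name, value in (line.split(":") for line in f)}

    PATH = "/tmp/testfile.bin"  # placeholder: a large file not in the page cache

    before = io_counters()
    with open(PATH, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)
        total = sum(mm[i] for i in range(0, len(mm), 4096))  # fault in every page
        mm.close()
    after = io_counters()

    # Expect rchar to barely move (no read()/pread() on the data file),
    # while read_bytes grows by roughly the file size.
    print("rchar delta:     ", after["rchar"] - before["rchar"])
    print("read_bytes delta:", after["read_bytes"] - before["read_bytes"])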