Question

I read /proc/<pid>/io to measure the I/O activity of SQL queries, where <pid> is the PID of the database server. I read the values before and after each query to compute the difference and get the number of bytes the request caused to be read and/or written.
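
Roughly, the measurement looks like this sketch (simplified; the hard-coded PID and the query step are placeholders, not my actual tooling):

#include <stdio.h>
#include <sys/types.h>

struct io_sample { unsigned long long rchar, read_bytes; };

/* Parse rchar and read_bytes out of /proc/<pid>/io. */
static int read_io(pid_t pid, struct io_sample *s)
{
    char path[64], line[128];
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%d/io", (int)pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        sscanf(line, "rchar: %llu", &s->rchar);
        sscanf(line, "read_bytes: %llu", &s->read_bytes);
    }
    fclose(f);
    return 0;
}

int main(void)
{
    struct io_sample before = {0}, after = {0};
    pid_t server_pid = 1234;    /* placeholder: the database server's PID */

    read_io(server_pid, &before);
    /* ... run one SQL query against the server here ... */
    read_io(server_pid, &after);

    printf("RCHAR:      %10.5f MB\n",
           (after.rchar - before.rchar) / 1048576.0);
    printf("READ_BYTES: %10.5f MB\n",
           (after.read_bytes - before.read_bytes) / 1048576.0);
    return 0;
}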

As far as I know, the field READ_BYTES counts actual disk I/O, while RCHAR includes more, such as reads that could be satisfied by the Linux page cache (see Understanding the counters in /proc/[pid]/io for clarification). This leads to the assumption that RCHAR should come up with a value greater than or equal to READ_BYTES, but my results contradict this assumption.

For the results I get with Infobright ICE, I could imagine some minor block or page overhead (values are MB):

Query        |    RCHAR | READ_BYTES
tpch_q01.sql | 34.44180 |   34.89453
tpch_q02.sql |  2.89191 |    3.64453
tpch_q03.sql | 32.58994 |   33.19531
tpch_q04.sql | 17.78325 |   18.27344

But I completely fail to understand the I/O counters for MonetDB (values are MB):

Query        |   RCHAR | READ_BYTES
tpch_q01.sql | 0.07501 |  220.58203
tpch_q02.sql | 1.37840 |   18.16016
tpch_q03.sql | 0.08272 |  162.38281
tpch_q04.sql | 0.06604 |   83.25391

Am I wrong with the assumption that RCHAR includes READ_BYTES? Is there a way to fool the kernel's counters that MonetDB could be using? What is going on here?

I might add that I clear the page cache and restart the database server before each query. I'm on Ubuntu 11.10, running kernel 3.0.0-15-generic.
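
Clearing the cache is the usual drop_caches write; a minimal sketch of the idea (needs root, equivalent to `echo 3 > /proc/sys/vm/drop_caches`):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    FILE *f;

    sync();                     /* flush dirty pages to disk first */
    f = fopen("/proc/sys/vm/drop_caches", "w");
    if (!f) {
        perror("drop_caches");
        return 1;
    }
    fputs("3", f);              /* 3 = page cache + dentries/inodes */
    fclose(f);
    return 0;
}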

Solution

I can only think of two things:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/filesystems/proc.txt;hb=HEAD#l1305

1:

read_bytes
----------

I/O counter: bytes read
Attempt to count the number of bytes which this process really did cause to
be fetched from the storage layer.

I read "caused to be fetched from the storage layer" to include readahead and the like.
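
This is easy to see (a sketch, assuming a file demo.dat that is not currently in the page cache): read a single byte and then look at /proc/self/io. rchar grows by just a handful of bytes, while read_bytes grows by at least one page plus readahead, which is the same pattern as the Infobright numbers above.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char c, line[128];
    FILE *io;
    int fd = open("demo.dat", O_RDONLY);   /* assumed: cold, uncached file */

    if (fd < 0 || read(fd, &c, 1) != 1) {  /* rchar += 1 */
        perror("open/read");
        return 1;
    }
    close(fd);

    /* read_bytes is now >= 4096: the kernel fetched a full page, plus
     * readahead, for that 1 byte. rchar stays tiny (the 1 byte plus this
     * very read of /proc/self/io). */
    io = fopen("/proc/self/io", "r");
    while (io && fgets(line, sizeof(line), io))
        fputs(line, stdout);
    if (io)
        fclose(io);
    return 0;
}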

2:

rchar
-----

I/O counter: chars read
The number of bytes which this task has caused to be read from storage. This
is simply the sum of bytes which this process passed to read() and pread().
It includes things like tty IO and it is unaffected by whether or not actual
physical disk IO was required (the read might have been satisfied from
pagecache)

Note that this says nothing about disk access via memory-mapped files. I think this is the more likely reason: your MonetDB probably mmaps its database files and then does everything on them directly.
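
To illustrate (a sketch, assuming a large uncached file demo.dat): the loop below never calls read(), so rchar stays near zero, yet every major page fault pulls data from disk and is accounted in read_bytes.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    struct stat st;
    unsigned char sum = 0;
    int fd = open("demo.dat", O_RDONLY);   /* assumed: large, uncached file */

    if (fd < 0 || fstat(fd, &st) < 0) {
        perror("open/fstat");
        return 1;
    }

    unsigned char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* Touch one byte per page: each major fault fetches data from disk
     * (counted in read_bytes) without a single read() (no rchar). */
    for (off_t i = 0; i < st.st_size; i += 4096)
        sum ^= p[i];

    munmap(p, st.st_size);
    close(fd);
    printf("checksum: %u\n", (unsigned)sum);
    return 0;
}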

I'm not really sure how you could measure the bandwidth used through mmap, given its nature.

OTHER TIPS

You can also read the Linux kernel source file include/linux/task_io_accounting.h:

struct task_io_accounting {
#ifdef CONFIG_TASK_XACCT
  /* bytes read */
  u64 rchar;
  /*  bytes written */
  u64 wchar;
  /* # of read syscalls */
  u64 syscr;
  /* # of write syscalls */
  u64 syscw;
#endif /* CONFIG_TASK_XACCT */

#ifdef CONFIG_TASK_IO_ACCOUNTING
  /*
   * The number of bytes which this task has caused to be read from
   * storage.
   */
  u64 read_bytes;

  /*
   * The number of bytes which this task has caused, or shall cause to be
   * written to disk.
   */
  u64 write_bytes;

  /*
   * A task can cause "negative" IO too.  If this task truncates some
   * dirty pagecache, some IO which another task has been accounted for
   * (in its write_bytes) will not be happening.  We _could_ just
   * subtract that from the truncating task's write_bytes, but there is
   * information loss in doing that.
   */
  u64 cancelled_write_bytes;
#endif /* CONFIG_TASK_IO_ACCOUNTING */
};
Licensed under: CC-BY-SA with attribution