Question

I am trying to estimate the real disk-usage required space for each record of my table in MySQL RDBMS.

The table has a structure like this:

  • ID INT 4 byte;
  • VARCHAR(34) 34 byte;
  • INT 4 byte;
  • INT(5) 4 byte;
  • INT 4 byte;
  • INT 4 byte which is also a FOREIGN KEY;

So there are 5 INT fields and a VARCHAR of a maximum of 34 chars (i.e. 34 bytes).

I have 2 questions:

1) The total should be 54 bytes per record (with variable VARCHAR, of course) am I right when I am saying that, or there are also some over-head bytes which I should consider when estimating the disk-usage space?

2) I have also used INT(5) instead of CHAR(5) cause I need to store only exactly 5 digits in that field (I am going to do that by application, with regExp and string length, cause I know that INT(5) could be more than an int with 5 digits). But could this be considered such as an optimization by the disk-usage space cause I am using an INT (4 bytes) instead of a CHAR(5) which is 5 bytes, i.e. 1 more byte per record?

Thanks for the attention!

Was it helpful?

Solution

One record itself will use

1 byte in offsets

0 bytes in NULLable bits

5 bytes in "extra bytes" header

4 bytes ID

6 bytes transaction id

7 bytes rollback pointer

0-3*34 bytes in VARCHAR(34) (one character may take up to 3 bytes because of UTF8)

4*4 bytes in other integers

Each distinct value of FK will lead to one record in a secondary index. it will use

5 bytes in "extra bytes" header

4 bytes INT for FK value

4 bytes INT for Primary key

Other overhead is page level: 120 bytes per page (16k) in headers page fill factor 15/16 - i.e. one page may contain 15k in records.

And the last - add space used by non-leaf pages, which should be small anyway

So, answer to question - 1) yes there will be some overhead that you can calculate using information above.

2) CHAR(5) in UTF8 will add a byte for its length, so INT looks reasoaanle to use

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top