Question

I have a VARCHAR(80) utf8mb4 column, and I'm testing adding some ASCII & Emoji characters, and use LENGTH() and CHAR_LENGTH() to understand the differences.

From reading in different places, my understanding is that each character on a utf8mb4 column would take 4 bytes. However, it appears that it may not be the case, if I understand LENGTH() as giving me the actual size that the specific contents are taking on that field.

Is it correct that a row with "aaaa" contents would take less storage than a row with "😁😁😁😁"?

enter image description here

No correct solution

OTHER TIPS

Actually, my answer is in the Documentation: https://mariadb.com/kb/en/unicode/

UTF-8 encoding using one to three bytes per character. Basic Latin letters, numbers and punctuation use one byte. European and Middle East letters mostly fit into 2 bytes. Korean, Chinese, and Japanese ideographs use 3-bytes. No supplementary characters are stored.

utf8mb4 -- Same as utf8, but stores supplementary characters in four bytes.

Also, as per https://mariadb.com/kb/en/data-type-storage-requirements/

String Data Types
In the descriptions below, M is the declared column length (in characters or in bytes), while len is the actual length in bytes of the value.
VARCHAR(M) -- len + 1 bytes if column is 0 – 255 bytes, len + 2 bytes if column may require more than 255 bytes

So, for VARCHAR(80) utf8mb4,
80x4 > 255, so a field with "aaaa" will take 6 bytes (len + 2)

When mysql/mariadb creates columns for UTF8 data they are created big enough to fit any string having given encoding.

VARCHAR(80) COLLATE 'utf8_general_ci' column will be exactly 80x3=240 bytes long.
VARCHAR(80) COLLATE 'utf8mb4_general_ci' column will be exactly 80x4=320 bytes long.

P.S.
InnoDB engine has so called "overflow pages" for table storage where all long data resides. The root page stores only a prefix for indexing and a pointer to the following page. The same is for TEXT and BLOB types.

That is why "row size" for table returned by SHOW TABLE STATUS can be less than max data size table is able to store.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top