Server running community 8.0.14, we hit the number of open files a few days ago, so I increased the limit but after a while the server crashed with this in the error log:

InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
03:07:21 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=221
max_threads=1000
thread_count=4
connection_count=3
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 403301 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f9200000b20
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fd571ceaca0 thread_stack 0x46000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char*, unsigned long)+0x3d) [0x561bdb69757d]
/usr/sbin/mysqld(handle_fatal_signal+0x423) [0x561bda8277c3]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890) [0x7fe5095f6890]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7) [0x7fe507981e97]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x141) [0x7fe507983801]
/usr/sbin/mysqld(ut_dbg_assertion_failed(char const*, char const*, unsigned long)+0x2d5) [0x561bdb94d2c5]
/usr/sbin/mysqld(+0x1fe243c) [0x561bdb94f43c]
/usr/sbin/mysqld(mem_heap_create_block_func(mem_block_info_t*, unsigned long, unsigned long)+0x9ba) [0x561bdb95638a]
/usr/sbin/mysqld(mem_heap_add_block(mem_block_info_t*, unsigned long)+0x72) [0x561bdb956562]
/usr/sbin/mysqld(trx_undo_update_rec_get_update(unsigned char const*, dict_index_t*, unsigned long, unsigned long, unsigned long, unsigned long, trx_t*, mem_block_info_t*, upd_t**, lob::undo_vers_t*, type_cmpl_t&)+0x1926) [0x561bdb924896]
/usr/sbin/mysqld(row_purge_step(que_thr_t*)+0x845) [0x561bdb8b7ad5]
/usr/sbin/mysqld(que_run_threads(que_thr_t*)+0x997) [0x561bdb859597]
/usr/sbin/mysqld(srv_worker_thread()+0x29d) [0x561bdb8e85bd]
/usr/sbin/mysqld(std::thread::_State_impl<std::thread::_Invoker<std::tuple<Runnable, void (*)()> > >::_M_run()+0x63) [0x561bdb759883]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xbd57f) [0x7fe5083a757f]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7fe5095eb6db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fe507a6488f]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): Connection ID (thread ID): 0
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
2019-01-27T03:07:37.554520Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.14) starting as process 56489

InnoDB: Progress in percents: 1InnoDB MEMCACHED: Memcached uses atomic increment
 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 1002019-01-27T03:12:07.973920Z 0 [System] [MY-010229] [Server] Starting crash recovery...
2019-01-27T03:12:07.982255Z 0 [System] [MY-010232] [Server] Crash recovery finished.
2019-01-27T03:12:10.581437Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2

The machine is a 2x Xeon E5-2680v4 - 28c/56t - 2.4GHz /3.3GHz with 512GB RAM, configuration file:

server-id = 1
log-bin   = /var/lib/mysql-binlogs/web07
relay-log = web07-relay-bin

binlog_expire_logs_seconds = 259200

open_files_limit = 10000
max_connections = 1000

innodb_flush_method=O_DIRECT
innodb_flush_log_at_trx_commit=0
innodb_log_file_size=38G

innodb_buffer_pool_size=300G
innodb_buffer_pool_instances=64

join_buffer_size=5M
tmp_table_size=1G
max_heap_table_size=512M
skip_name_resolve=1

table_open_cache=20000

sql-mode="ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION,NO_AUTO_VALUE_ON_ZERO"

relay-log-recovery=1

default-authentication-plugin=mysql_native_password

After that crash, the server crashes again if I don't restart it when the open files become too many (every about 24 hours), this is now for example:

$ mysql -e "show global status like 'open_%';"
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 62507 |
| Open_streams             | 0     |
| Open_table_definitions   | 1843  |
| Open_tables              | 18779 |
| Opened_files             | 71902 |
| Opened_table_definitions | 3168  |
| Opened_tables            | 18803 |
+--------------------------+-------+

While I was writing this question, the server crashed:

2019-01-29T22:27:51.031406Z 54298197 [ERROR] [MY-013132] [Server] The table '/tmp/#sql9ae8_33c8655_13' is full!

2019-01-29T22:28:01.844610Z 54321796 [ERROR] [MY-013132] [Server] The table '/tmp/#sql9ae8_33ce284_1' is full!

How can I debug why there are so many open files? The traffic is high but usually there are no more than 10 connections open, so why the temporary tables are not closed?

enter image description here

lsof: (~27k total rows, 10 hours after restart)

COMMAND   PID  USER   FD      TYPE             DEVICE     SIZE/OFF       NODE NAME
mysqld  23964 mysql  cwd       DIR                9,3         4096   49549201 /var/lib/mysql
mysqld  23964 mysql  rtd       DIR                9,3         4096          2 /
mysqld  23964 mysql  txt       REG                9,3     56952368    6292097 /usr/sbin/mysqld
mysqld  23964 mysql  DEL       REG                9,3                12858561 /tmp/mysql_temptable.As3HWG
mysqld  23964 mysql  DEL       REG                9,3                12858557 /tmp/mysql_temptable.I4gP0F
mysqld  23964 mysql  DEL       REG                9,3                12858554 /tmp/mysql_temptable.W139aF
mysqld  23964 mysql  DEL       REG                9,3                12858553 /tmp/mysql_temptable.MhBwlE
mysqld  23964 mysql  DEL       REG                9,3                12858551 /tmp/mysql_temptable.mcw0vD
mysqld  23964 mysql  DEL       REG                9,3                12858550 /tmp/mysql_temptable.OFuZ94
mysqld  23964 mysql  DEL       REG                9,3                12858544 /tmp/mysql_temptable.2ffRvY
mysqld  23964 mysql  DEL       REG                9,3                12858549 /tmp/mysql_temptable.VvXQPw
mysqld  23964 mysql  DEL       REG                9,3                12858541 /tmp/mysql_temptable.KTTZRR
mysqld  23964 mysql  DEL       REG                9,3                12858542 /tmp/mysql_temptable.eMVSbq
mysqld  23964 mysql  DEL       REG                9,3                12858538 /tmp/mysql_temptable.6Kaay

After the first lines, all files are (deleted). I think this is the issue, why these pointers aren't released?

mysqld  27864 mysql 1658u      REG                9,3      1048584   12846342 /tmp/mysql_temptable.wazlKx (deleted)
mysqld  27864 mysql 1660u      REG                9,3      1048584   12846352 /tmp/mysql_temptable.S5Oo1m (deleted)
mysqld  27864 mysql 1662u      REG                9,3      1048584   12846343 /tmp/mysql_temptable.l1CZjK (deleted)
mysqld  27864 mysql 1666u      REG                9,3      1048584   12846366 /tmp/mysql_temptable.VDaTxb (deleted)

2019-02-19: I've increased some limits as suggested, and done a mysqlcheck --all-databases, with no changes.

As written in the comments, the number of Open files is not decreasing even if I stop the traffic.

The application is proven stable and no significant changes have been made in the code or in the amount of traffic in the days before the issue appeared. (I am 100% sure because that day was holiday and there are no commits on the master branch)

有帮助吗?

解决方案

The issue is a Mysql bug (https://bugs.mysql.com/bug.php?id=94185 )

Anyway, some tuning mitigated the issue and stopped the number of open files to grow.

Specifically, we set:

innodb_lru_scan_depth=100
innodb_flushing_avg_loops=5
thread_cache_size=216

Fixed as of the upcoming 8.0.16 release, and here's the changelog entry:

Static thread local variables defined at the wrong scope were not released at thread exit.

其他提示

For starters, consider max_connections=500 rather than 1,000 to reduce your RAM footprint significantly in view of the detail of max_connections_used of 221 when your assertion failure event was documented.

Most configs I see have tmp_table_size=1G max_heap_table_size=512M identical sizes.

许可以下: CC-BY-SA归因
不隶属于 dba.stackexchange
scroll top