Question

I recently switched the Galera synchronization method from rsync to mariabackup, which greatly reduced CPU usage on our DB servers, and also increased innodb_log_file_size from about 600M to 2G. Since then (it might be a coincidence) I have been getting a lot of aborted connection errors and poor log write efficiency. I initially thought it had something to do with wait_timeout (which we had set to 5 because our application has to be very fast), so I raised it to 60 and also raised max_allowed_packet to 2G. Even after that the errors remained, and AlertManager raises host crash alerts almost hourly.
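For reference, a minimal sketch of how the settings and counters mentioned above can be read back on a node. The names are standard MariaDB system and status variables; nothing here is output from the cluster in question.

```sql
-- Settings that were changed (values are whatever the node currently runs with)
SHOW GLOBAL VARIABLES
WHERE Variable_name IN ('wsrep_sst_method',
                        'innodb_log_file_size',
                        'wait_timeout',
                        'max_allowed_packet');

-- Aborted_clients:  connections dropped without a proper close (e.g. timeouts)
-- Aborted_connects: connection attempts that failed (e.g. bad credentials)
SHOW GLOBAL STATUS LIKE 'Aborted_c%';
```

Comparing the two Aborted_* counters a few minutes apart should show which kind of abort is actually growing.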

[--] Skipped version check for MySQLTuner script
[OK] Logged in using credentials passed on the command line
[OK] Currently running supported MySQL version 10.4.14-MariaDB-1:10.4.14+maria~focal-log
[OK] Operating on 64-bit architecture
 
-------- Log file Recommendations ------------------------------------------------------------------
[!!] Log file  doesn't exist
 
-------- Storage Engine Statistics -----------------------------------------------------------------
[--] Status: +Aria +CSV +InnoDB +MEMORY +MRG_MyISAM +MyISAM +PERFORMANCE_SCHEMA +SEQUENCE 
[--] Data in InnoDB tables: 16.3G (Tables: 150)
[OK] Total fragmented tables: 0
 
-------- Analysis Performance Metrics --------------------------------------------------------------
[--] innodb_stats_on_metadata: OFF
[OK] No stat updates during querying INFORMATION_SCHEMA.
 
-------- Security Recommendations ------------------------------------------------------------------
[OK] There are no anonymous accounts for any database users
[!!] User 'mariadb.sys@localhost' has no password set.
[!!] User 'developer@%' does not specify hostname restrictions.
[--] There are 620 basic passwords in the list.
 
-------- CVE Security Recommendations --------------------------------------------------------------
[OK] NO SECURITY CVE FOUND FOR YOUR VERSION
 
-------- Performance Metrics -----------------------------------------------------------------------
[--] Up for: 4h 56m 41s (1M q [64.284 qps], 33K conn, TX: 758M, RX: 166M)
[--] Reads / Writes: 40% / 60%
[--] Binary logging is enabled (GTID MODE: OFF)
[--] Physical Memory     : 54.9G
[--] Max MySQL memory    : 46.1G
[--] Other process memory: 0B
[--] Total buffers: 19.2G global + 26.9M per thread (1000 max threads)
[--] P_S Max memory usage: 623M
[--] Galera GCache Max memory usage: 128M
[OK] Maximum reached memory usage: 20.7G (37.77% of installed RAM)
[OK] Maximum possible memory usage: 46.1G (83.90% of installed RAM)
[OK] Overall possible memory usage with other process is compatible with memory available
[OK] Slow queries: 4% (55K/1M)
[OK] Highest usage of available connections: 3% (36/1000)
[!!] Aborted connections: 10.61%  (3560/33563)
[OK] Query cache is disabled by default due to mutex contention on multiprocessor machines.
[OK] Sorts requiring temporary tables: 0% (0 temp sorts / 3 sorts)
[OK] No joins without indexes
[OK] Temporary tables created on disk: 12% (13 on disk / 107 total)
[OK] Thread cache hit rate: 99% (69 created / 33K connections)
[OK] Table cache hit rate: 97% (290 open / 296 opened)
[OK] table_definition_cache(400) is upper than number of tables(313)
[OK] Open file limit used: 0% (58/16K)
[OK] Table locks acquired immediately: 100% (686 immediate / 686 locks)
[OK] Binlog cache memory access: 100.00% (586009 Memory / 586009 Total)
 
-------- Performance schema ------------------------------------------------------------------------
[--] Memory used by P_S: 623.7M
[--] Sys schema isn't installed.
 
-------- ThreadPool Metrics ------------------------------------------------------------------------
[--] ThreadPool stat is enabled.
[--] Thread Pool Size: 14 thread(s).
[--] Using default value is good enough for your version (10.4.14-MariaDB-1:10.4.14+maria~focal-log)
 
-------- MyISAM Metrics ----------------------------------------------------------------------------
[!!] Key buffer used: 18.3% (6M used / 33M cache)
[!!] Cannot calculate MyISAM index size - re-run script as root user
 
-------- InnoDB Metrics ----------------------------------------------------------------------------
[--] InnoDB is enabled.
[--] InnoDB Thread Concurrency: 0
[OK] InnoDB File per table is activated
[OK] InnoDB buffer pool / data size: 19.0G/16.3G
[OK] Ratio InnoDB log file size / InnoDB Buffer pool size: 2.0G * 2/19.0G should be equal to 25%
[OK] InnoDB buffer pool instances: 19
[--] Number of InnoDB Buffer Pool Chunk : 152 for 19 Buffer Pool Instance(s)
[OK] Innodb_buffer_pool_size aligned with Innodb_buffer_pool_chunk_size & Innodb_buffer_pool_instances
[OK] InnoDB Read buffer efficiency: 100.00% (12044117752 hits/ 12044442245 total)
[!!] InnoDB Write Log efficiency: 87.27% (874238 hits/ 1001803 total)
[OK] InnoDB log waits: 0.00% (0 waits / 127565 writes)
 
-------- AriaDB Metrics ----------------------------------------------------------------------------
[--] AriaDB is enabled.
[OK] Aria pagecache size / total Aria indexes: 128.0M/312.0K
[OK] Aria pagecache hit rate: 99.5% (2K cached / 14 reads)
 
-------- TokuDB Metrics ----------------------------------------------------------------------------
[--] TokuDB is disabled.
 
-------- XtraDB Metrics ----------------------------------------------------------------------------
[--] XtraDB is disabled.
 
-------- Galera Metrics ----------------------------------------------------------------------------
[--] Galera is enabled.
[--] GCache is using 0B
[--] CPU core detected  : 14

[--] wsrep_slave_threads: 32
[OK] wsrep_slave_threads is equal to 2, 3 or 4 times number of CPU(s)
[OK] gcs.fc_limit should be equal to 5 * wsrep_slave_threads
[--] wsrep parallel slave can cause frequent inconsistency crash.
[OK] gcs.fc_limit is equal to 5 * wsrep_slave_threads
[OK] gcs.fc_factor is equal to 0.8
[OK] Flow control fraction seems to be OK (wsrep_flow_control_paused<=0.02)
[OK] All tables are InnoDB tables
[OK] Binlog format is in ROW mode.
[OK] InnoDB flush log at each commit is disabled for Galera.
[--] Read consistency mode :OFF
[OK] Galera WsREP is enabled.
[OK] Galera Cluster address is defined: gcomm://10.10.5.151,10.10.5.152,10.10.5.153
[--] There are 3 nodes in wsrep_cluster_address
[OK] There are 3 nodes in wsrep_cluster_size.
[OK] All cluster nodes detected.
[OK] Galera Cluster name is defined: fraudsniper_cluster
[OK] Galera Node name is defined: node1
[!!] Galera Notify command is not defined.
[OK] SST Method is based on xtrabackup.
[OK] TOI is default mode for upgrade.
[--] Max WsRep message : 2.0G
[OK] Node is connected
[OK] Node is ready
[--] Cluster status :Primary
[OK] Galera cluster is consistent and ready for operations
[OK] Node and whole cluster at the same level: d1108e1b-bb16-11ea-b339-c7a9c5da38c9
[OK] Node is synced with whole cluster.
[OK] There is no certification failures detected.
 
-------- Replication Metrics -----------------------------------------------------------------------
[--] Galera Synchronous replication: YES
[--] No replication slave(s) for this server.
[--] Binlog format: ROW
[--] XA support enabled: ON
[--] Semi synchronous replication Master: OFF
[--] Semi synchronous replication Slave: OFF
[--] This is a standalone server
 
-------- Recommendations ---------------------------------------------------------------------------
General recommendations:
    Set up a Secure Password for mariadb.sys@localhost user: SET PASSWORD FOR 'mariadb.sys'@'SpecificDNSorIp' = PASSWORD('secure_password');
    Restrict Host for 'developer'@% to developer@SpecificDNSorIp
    UPDATE mysql.user SET host ='SpecificDNSorIp' WHERE user='developer' AND host ='%'; FLUSH PRIVILEGES;
    MySQL was started within the last 24 hours - recommendations may be inaccurate
    Reduce or eliminate unclosed connections and network issues
    Consider installing Sys schema from https://github.com/mysql/mysql-sys for MySQL
    Consider installing Sys schema from https://github.com/FromDual/mariadb-sys for MariaDB
Variables to adjust:
    Set wsrep_slave_threads to 1 in case of HA_ERR_FOUND_DUPP_KEY crash on slave
    set up parameter wsrep_notify_cmd to be notify

I'm really at a loss as to how to solve this. I've tried reducing the log file size to 1G, but that doesn't seem to help either. I've also tried using this query:

SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST WHERE TIME > 1 AND COMMAND <> 'Sleep';

to try to debug slow queries, but it always returns an empty set, even though the tuner reports over 5k slow queries (the slow query time is set to 5s).
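One thing worth keeping in mind is that PROCESSLIST only shows statements that are running at the instant it is queried, while the tuner's slow query figure comes from a counter that accumulates since server start. A hedged sketch of how to look at that counter and the slow log settings, assuming the standard MariaDB variable names:

```sql
-- Cumulative number of statements that ran longer than long_query_time
SHOW GLOBAL STATUS LIKE 'Slow_queries';

-- Threshold and slow log configuration; if slow_query_log is ON,
-- the offending statements are written to slow_query_log_file
-- (or to the mysql.slow_log table when log_output includes TABLE)
SHOW GLOBAL VARIABLES
WHERE Variable_name IN ('long_query_time',
                        'slow_query_log',
                        'slow_query_log_file',
                        'log_output');
```

If the counter keeps rising while PROCESSLIST stays empty, the slow log is likely a better place to find the statements than polling the process list.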

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange