Issue Starting MariaDB as a Slave
-
I have a MariaDB / MySQL database running on one node, no problems (as far as I am aware.) On a second node I have a MariaDB slave that used to be fine but now cannot start. I have not seen these database errors before and I am attempting to debug this to figure out what is wrong.
150615 23:31:33 mysqld_safe Starting mysqld daemon with databases from /db/mysql
150615 23:31:33 mysqld_safe WSREP: Running position recovery with --log_error='/db/mysql/wsrep_recovery.KrXayq' --pid-file='/db/mysql/mysql_slave-app_slave00-recover.pid'
150615 23:31:40 mysqld_safe WSREP: Recovered position f373d3de-c577-11e3-9ccf-fe8f0d75d59c:3896780
150615 23:31:40 [Note] WSREP: wsrep_start_position var submitted: 'f373d3de-c577-11e3-9ccf-fe8f0d75d59c:3896780'
150615 23:31:40 [Note] WSREP: Read nil XID from storage engines, skipping position init
150615 23:31:40 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
150615 23:31:40 [Note] WSREP: wsrep_load(): Galera 25.3.2(r170) by Codership Oy [email protected] loaded successfully.
150615 23:31:40 [Note] WSREP: CRC-32C: using hardware acceleration.
150615 23:31:40 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
150615 23:31:40 [Note] WSREP: Passing config to GCS: base_host = x.x.x.x; base_port = 4567; cert.log_conflicts = no; evs.inactive_check_period = PT3S; evs.inactive_timeout = PT60S; evs.install_timeout = PT60S; evs.keepalive_period = PT1S; evs.suspect_timeout = PT10S; gcache.dir = /db/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /db/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 5G; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.proto_max = 5
150615 23:31:40 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
150615 23:31:40 [Note] WSREP: wsrep_sst_grab()
150615 23:31:40 [Note] WSREP: Start replication
150615 23:31:40 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
150615 23:31:40 [Note] WSREP: protonet asio version 0
150615 23:31:40 [Note] WSREP: Using CRC-32C (optimized) for message checksums.
150615 23:31:40 [Note] WSREP: backend: asio
150615 23:31:40 [Note] WSREP: GMCast version 0
150615 23:31:40 [Note] WSREP: (aca0e458-13b6-11e5-92a0-b2b8427ffb6c, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
150615 23:31:40 [Note] WSREP: (aca0e458-13b6-11e5-92a0-b2b8427ffb6c, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
150615 23:31:40 [Note] WSREP: EVS version 0
150615 23:31:40 [Note] WSREP: PC version 0
150615 23:31:40 [Note] WSREP: gcomm: connecting to group 'xxxxx', peer 'x.x.x.x:,x.x.x.x:'
150615 23:31:40 [Warning] WSREP: (aca0e458-13b6-11e5-92a0-b2b8427ffb6c, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' points to own listening address, blacklisting
150615 23:31:40 [Note] WSREP: (aca0e458-13b6-11e5-92a0-b2b8427ffb6c, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid aca0e458-13b6-11e5-92a0-b2b8427ffb6c is blacklisted, skipping
150615 23:31:41 [Note] WSREP: declaring 1d809986-1390-11e5-868f-2b0449a68232 stable
150615 23:31:41 [Note] WSREP: Node 1d809986-1390-11e5-868f-2b0449a68232 state prim
150615 23:31:41 [Note] WSREP: view(view_id(PRIM,1d809986-1390-11e5-868f-2b0449a68232,34) memb {
1d809986-1390-11e5-868f-2b0449a68232,0
aca0e458-13b6-11e5-92a0-b2b8427ffb6c,0
} joined {
} left {
} partitioned {
})
150615 23:31:41 [Note] WSREP: gcomm: connected
150615 23:31:41 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
150615 23:31:41 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
150615 23:31:41 [Note] WSREP: Opened channel 'xxxxx'
150615 23:31:41 [Note] WSREP: Waiting for SST to complete.
150615 23:31:41 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
150615 23:31:41 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
150615 23:31:41 [Note] WSREP: STATE EXCHANGE: sent state msg: acee0164-13b6-11e5-bca2-6a93a1c3e829
150615 23:31:41 [Note] WSREP: STATE EXCHANGE: got state msg: acee0164-13b6-11e5-bca2-6a93a1c3e829 from 0 (mysql00)
150615 23:31:41 [Note] WSREP: STATE EXCHANGE: got state msg: acee0164-13b6-11e5-bca2-6a93a1c3e829 from 1 (mysql_slave-app_slave00)
150615 23:31:41 [Note] WSREP: Quorum results:
version = 3,
component = PRIMARY,
conf_id = 33,
members = 1/2 (joined/total),
act_id = 3903514,
last_appl. = -1,
protocols = 0/5/2 (gcs/repl/appl),
group UUID = f373d3de-c577-11e3-9ccf-fe8f0d75d59c
150615 23:31:41 [Note] WSREP: Flow-control interval: [23, 23]
150615 23:31:41 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 3903514)
150615 23:31:41 [Note] WSREP: State transfer required:
Group state: f373d3de-c577-11e3-9ccf-fe8f0d75d59c:3903514
Local state: 00000000-0000-0000-0000-000000000000:-1
150615 23:31:41 [Note] WSREP: New cluster view: global state: f373d3de-c577-11e3-9ccf-fe8f0d75d59c:3903514, view# 34: Primary, number of nodes: 2, my index: 1, protocol version 2
150615 23:31:41 [Warning] WSREP: Gap in state sequence. Need state transfer.
150615 23:31:43 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'joiner' --address 'x.x.x.x' --auth 'no:nono' --datadir '/db/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '16465''
WSREP_SST: [INFO] Streaming with tar (20150615 23:31:43.615)
WSREP_SST: [INFO] Note: Advanced xtrabackup features - encryption,compression etc. not available with tar. (20150615 23:31:43.619)
WSREP_SST: [INFO] However, xbstream requires manual cleanup of data directory before SST - lp:1193240. (20150615 23:31:43.623)
150615 23:31:43 [Note] WSREP: Prepared SST request: xtrabackup|x.x.x.x:4444/xtrabackup_sst
150615 23:31:43 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150615 23:31:43 [Note] WSREP: REPL Protocols: 5 (3, 1)
150615 23:31:43 [Note] WSREP: Assign initial position for certification: 3903514, protocol version: 3
150615 23:31:43 [Note] WSREP: Service thread queue flushed.
150615 23:31:43 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (f373d3de-c577-11e3-9ccf-fe8f0d75d59c): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():445. IST will be unavailable.
150615 23:31:43 [Note] WSREP: Node 1.0 (mysql_slave-app_slave00) requested state transfer from 'any'. Selected 0.0 (mysql00)(SYNCED) as donor.
150615 23:31:43 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 3903514)
150615 23:31:43 [Note] WSREP: Requesting state transfer: success, donor: 0
WSREP_SST: [ERROR] xtrabackup process ended without creating '/db/mysql//xtrabackup_galera_info' (20150615 23:31:45.340)
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20150615 23:31:45.340)
150615 23:31:45 [Warning] WSREP: 0.0 (mysql00): State transfer to 1.0 (mysql_slave-app_slave00) failed: -1 (Operation not permitted)
150615 23:31:45 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():723: Will never receive state. Need to abort.
150615 23:31:45 [Note] WSREP: gcomm: terminating thread
150615 23:31:45 [Note] WSREP: gcomm: joining thread
150615 23:31:45 [Note] WSREP: gcomm: closing backend
WSREP_SST: [ERROR] Killing nc pid (20150615 23:31:45.363)
WSREP_SST: [INFO] Removing the sst_in_progress file (20150615 23:31:45.369)
150615 23:31:45 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address 'x.x.x.x' --auth 'no:no' --datadir '/db/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '16465': 32 (Broken pipe)
150615 23:31:45 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
150615 23:31:45 [ERROR] WSREP: SST failed: 32 (Broken pipe)
150615 23:31:45 [ERROR] Aborting150615 23:31:46 [Note] WSREP: view(view_id(NON_PRIM,1d809986-1390-11e5-868f-2b0449a68232,34) memb {
aca0e458-13b6-11e5-92a0-b2b8427ffb6c,0
} joined {
} left {
} partitioned {
1d809986-1390-11e5-868f-2b0449a68232,0
})
150615 23:31:46 [Note] WSREP: view((empty))
150615 23:31:46 [Note] WSREP: gcomm: closed
150615 23:31:46 [Note] WSREP: /usr/sbin/mysqld: Terminated.
150615 23:31:46 mysqld_safe mysqld from pid file /db/mysql/mysqld.pid ended -
This appears to be the most relevant section:
WSREP_SST: [ERROR] xtrabackup process ended without creating '/db/mysql//xtrabackup_galera_info' (20150615 23:31:45.340) WSREP_SST: [ERROR] Cleanup after exit with status:32 (20150615 23:31:45.340) 150615 23:31:45 [Warning] WSREP: 0.0 (mysql00): State transfer to 1.0 (mysql_slave-app_slave00) failed: -1 (Operation not permitted) 150615 23:31:45 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():723: Will never receive state. Need to abort.
-
Although this could definitely be something:
150615 23:48:22 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (f373d3de-c577-11e3-9ccf-fe8f0d75d59c): 1 (Operation not permitted)
-
Could this be an issue related to Percona XtraBackup?
-
Trying anything I can think of, this part looks promising to maybe be the issue...
WSREP_SST: [INFO] Removing the sst_in_progress file (20150615 23:31:45.369) 150615 23:31:45 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address 'x.x.x.x' --auth 'no:no' --datadir '/db/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '16465': 32 (Broken pipe) 150615 23:31:45 [ERROR] WSREP: Failed to read uuid:seqno from joiner script. 150615 23:31:45 [ERROR] WSREP: SST failed: 32 (Broken pipe) 150615 23:31:45 [ERROR] Aborting