Issue Starting MariaDB as a Slave
- 
 I have a MariaDB / MySQL database running on one node, no problems (as far as I am aware.) On a second node I have a MariaDB slave that used to be fine but now cannot start. I have not seen these database errors before and I am attempting to debug this to figure out what is wrong. 150615 23:31:33 mysqld_safe Starting mysqld daemon with databases from /db/mysql 
 150615 23:31:33 mysqld_safe WSREP: Running position recovery with --log_error='/db/mysql/wsrep_recovery.KrXayq' --pid-file='/db/mysql/mysql_slave-app_slave00-recover.pid'
 150615 23:31:40 mysqld_safe WSREP: Recovered position f373d3de-c577-11e3-9ccf-fe8f0d75d59c:3896780
 150615 23:31:40 [Note] WSREP: wsrep_start_position var submitted: 'f373d3de-c577-11e3-9ccf-fe8f0d75d59c:3896780'
 150615 23:31:40 [Note] WSREP: Read nil XID from storage engines, skipping position init
 150615 23:31:40 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
 150615 23:31:40 [Note] WSREP: wsrep_load(): Galera 25.3.2(r170) by Codership Oy [email protected] loaded successfully.
 150615 23:31:40 [Note] WSREP: CRC-32C: using hardware acceleration.
 150615 23:31:40 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
 150615 23:31:40 [Note] WSREP: Passing config to GCS: base_host = x.x.x.x; base_port = 4567; cert.log_conflicts = no; evs.inactive_check_period = PT3S; evs.inactive_timeout = PT60S; evs.install_timeout = PT60S; evs.keepalive_period = PT1S; evs.suspect_timeout = PT10S; gcache.dir = /db/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /db/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 5G; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.proto_max = 5
 150615 23:31:40 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
 150615 23:31:40 [Note] WSREP: wsrep_sst_grab()
 150615 23:31:40 [Note] WSREP: Start replication
 150615 23:31:40 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
 150615 23:31:40 [Note] WSREP: protonet asio version 0
 150615 23:31:40 [Note] WSREP: Using CRC-32C (optimized) for message checksums.
 150615 23:31:40 [Note] WSREP: backend: asio
 150615 23:31:40 [Note] WSREP: GMCast version 0
 150615 23:31:40 [Note] WSREP: (aca0e458-13b6-11e5-92a0-b2b8427ffb6c, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
 150615 23:31:40 [Note] WSREP: (aca0e458-13b6-11e5-92a0-b2b8427ffb6c, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
 150615 23:31:40 [Note] WSREP: EVS version 0
 150615 23:31:40 [Note] WSREP: PC version 0
 150615 23:31:40 [Note] WSREP: gcomm: connecting to group 'xxxxx', peer 'x.x.x.x:,x.x.x.x:'
 150615 23:31:40 [Warning] WSREP: (aca0e458-13b6-11e5-92a0-b2b8427ffb6c, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' points to own listening address, blacklisting
 150615 23:31:40 [Note] WSREP: (aca0e458-13b6-11e5-92a0-b2b8427ffb6c, 'tcp://0.0.0.0:4567') address 'tcp://x.x.x.x:4567' pointing to uuid aca0e458-13b6-11e5-92a0-b2b8427ffb6c is blacklisted, skipping
 150615 23:31:41 [Note] WSREP: declaring 1d809986-1390-11e5-868f-2b0449a68232 stable
 150615 23:31:41 [Note] WSREP: Node 1d809986-1390-11e5-868f-2b0449a68232 state prim
 150615 23:31:41 [Note] WSREP: view(view_id(PRIM,1d809986-1390-11e5-868f-2b0449a68232,34) memb {
 1d809986-1390-11e5-868f-2b0449a68232,0
 aca0e458-13b6-11e5-92a0-b2b8427ffb6c,0
 } joined {
 } left {
 } partitioned {
 })
 150615 23:31:41 [Note] WSREP: gcomm: connected
 150615 23:31:41 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
 150615 23:31:41 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
 150615 23:31:41 [Note] WSREP: Opened channel 'xxxxx'
 150615 23:31:41 [Note] WSREP: Waiting for SST to complete.
 150615 23:31:41 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
 150615 23:31:41 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
 150615 23:31:41 [Note] WSREP: STATE EXCHANGE: sent state msg: acee0164-13b6-11e5-bca2-6a93a1c3e829
 150615 23:31:41 [Note] WSREP: STATE EXCHANGE: got state msg: acee0164-13b6-11e5-bca2-6a93a1c3e829 from 0 (mysql00)
 150615 23:31:41 [Note] WSREP: STATE EXCHANGE: got state msg: acee0164-13b6-11e5-bca2-6a93a1c3e829 from 1 (mysql_slave-app_slave00)
 150615 23:31:41 [Note] WSREP: Quorum results:
 version = 3,
 component = PRIMARY,
 conf_id = 33,
 members = 1/2 (joined/total),
 act_id = 3903514,
 last_appl. = -1,
 protocols = 0/5/2 (gcs/repl/appl),
 group UUID = f373d3de-c577-11e3-9ccf-fe8f0d75d59c
 150615 23:31:41 [Note] WSREP: Flow-control interval: [23, 23]
 150615 23:31:41 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 3903514)
 150615 23:31:41 [Note] WSREP: State transfer required:
 Group state: f373d3de-c577-11e3-9ccf-fe8f0d75d59c:3903514
 Local state: 00000000-0000-0000-0000-000000000000:-1
 150615 23:31:41 [Note] WSREP: New cluster view: global state: f373d3de-c577-11e3-9ccf-fe8f0d75d59c:3903514, view# 34: Primary, number of nodes: 2, my index: 1, protocol version 2
 150615 23:31:41 [Warning] WSREP: Gap in state sequence. Need state transfer.
 150615 23:31:43 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'joiner' --address 'x.x.x.x' --auth 'no:nono' --datadir '/db/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '16465''
 WSREP_SST: [INFO] Streaming with tar (20150615 23:31:43.615)
 WSREP_SST: [INFO] Note: Advanced xtrabackup features - encryption,compression etc. not available with tar. (20150615 23:31:43.619)
 WSREP_SST: [INFO] However, xbstream requires manual cleanup of data directory before SST - lp:1193240. (20150615 23:31:43.623)
 150615 23:31:43 [Note] WSREP: Prepared SST request: xtrabackup|x.x.x.x:4444/xtrabackup_sst
 150615 23:31:43 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
 150615 23:31:43 [Note] WSREP: REPL Protocols: 5 (3, 1)
 150615 23:31:43 [Note] WSREP: Assign initial position for certification: 3903514, protocol version: 3
 150615 23:31:43 [Note] WSREP: Service thread queue flushed.
 150615 23:31:43 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (f373d3de-c577-11e3-9ccf-fe8f0d75d59c): 1 (Operation not permitted)
 at galera/src/replicator_str.cpp:prepare_for_IST():445. IST will be unavailable.
 150615 23:31:43 [Note] WSREP: Node 1.0 (mysql_slave-app_slave00) requested state transfer from 'any'. Selected 0.0 (mysql00)(SYNCED) as donor.
 150615 23:31:43 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 3903514)
 150615 23:31:43 [Note] WSREP: Requesting state transfer: success, donor: 0
 WSREP_SST: [ERROR] xtrabackup process ended without creating '/db/mysql//xtrabackup_galera_info' (20150615 23:31:45.340)
 WSREP_SST: [ERROR] Cleanup after exit with status:32 (20150615 23:31:45.340)
 150615 23:31:45 [Warning] WSREP: 0.0 (mysql00): State transfer to 1.0 (mysql_slave-app_slave00) failed: -1 (Operation not permitted)
 150615 23:31:45 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():723: Will never receive state. Need to abort.
 150615 23:31:45 [Note] WSREP: gcomm: terminating thread
 150615 23:31:45 [Note] WSREP: gcomm: joining thread
 150615 23:31:45 [Note] WSREP: gcomm: closing backend
 WSREP_SST: [ERROR] Killing nc pid (20150615 23:31:45.363)
 WSREP_SST: [INFO] Removing the sst_in_progress file (20150615 23:31:45.369)
 150615 23:31:45 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address 'x.x.x.x' --auth 'no:no' --datadir '/db/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '16465': 32 (Broken pipe)
 150615 23:31:45 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
 150615 23:31:45 [ERROR] WSREP: SST failed: 32 (Broken pipe)
 150615 23:31:45 [ERROR] Aborting150615 23:31:46 [Note] WSREP: view(view_id(NON_PRIM,1d809986-1390-11e5-868f-2b0449a68232,34) memb { 
 aca0e458-13b6-11e5-92a0-b2b8427ffb6c,0
 } joined {
 } left {
 } partitioned {
 1d809986-1390-11e5-868f-2b0449a68232,0
 })
 150615 23:31:46 [Note] WSREP: view((empty))
 150615 23:31:46 [Note] WSREP: gcomm: closed
 150615 23:31:46 [Note] WSREP: /usr/sbin/mysqld: Terminated.
 150615 23:31:46 mysqld_safe mysqld from pid file /db/mysql/mysqld.pid ended
- 
 This appears to be the most relevant section: WSREP_SST: [ERROR] xtrabackup process ended without creating '/db/mysql//xtrabackup_galera_info' (20150615 23:31:45.340) WSREP_SST: [ERROR] Cleanup after exit with status:32 (20150615 23:31:45.340) 150615 23:31:45 [Warning] WSREP: 0.0 (mysql00): State transfer to 1.0 (mysql_slave-app_slave00) failed: -1 (Operation not permitted) 150615 23:31:45 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():723: Will never receive state. Need to abort.
- 
 Although this could definitely be something: 150615 23:48:22 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (f373d3de-c577-11e3-9ccf-fe8f0d75d59c): 1 (Operation not permitted)
- 
 Could this be an issue related to Percona XtraBackup? 
- 
 Trying anything I can think of, this part looks promising to maybe be the issue... WSREP_SST: [INFO] Removing the sst_in_progress file (20150615 23:31:45.369) 150615 23:31:45 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address 'x.x.x.x' --auth 'no:no' --datadir '/db/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '16465': 32 (Broken pipe) 150615 23:31:45 [ERROR] WSREP: Failed to read uuid:seqno from joiner script. 150615 23:31:45 [ERROR] WSREP: SST failed: 32 (Broken pipe) 150615 23:31:45 [ERROR] Aborting
