MySQL Group replication failover detection while node is recovering
I'm trying to get a solid MySQL Group Replication load balancing / failover setup. Currently I use keepalived to share one private IP to connect to the load-balanced MySQL Group Replication cluster, with balancing/failover arranged through haproxy (TCP checks on port 33061), which works great.
However, once a node gets into an unreachable state (because of network issues) and eventually goes offline for the cluster, we have to join the node back to the cluster, which is all fine and works. The problem is that during the recovery phase (state RECOVERING), the node is already listening on port 33061, so the TCP check passes and the node becomes eligible for load balancing and failover even though it is not yet operational as a cluster member.
Is there any check I can add to prevent the node from being marked online while it is still in the RECOVERING state, joining the cluster? Usually this process is rather quick, but it has also happened a few times that it took up to 15 minutes, causing database errors during this phase. Many thanks!
option 1) Could you use mysqlrouter between HAProxy and MySQL (e.g. HAProxy >> Router >> MySQL)? I believe the router should (in theory) see that the node is "recovering" and ignore it.
option 2) Alternatively, could you use something like this https://sysbible.org/2008/12/04/having-haproxy-check-mysql-status-through-a-xinetd-script/
Essentially you use xinetd on the database server to query mysql and return a yes/no result for availability, thus informing HAProxy whether a given replica is available for use.
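As a rough illustration of option 2, here is a minimal sketch of such a check script in Python, not something I have tested against Group Replication. It assumes the `mysql` command-line client is installed on the database server and can log in without a prompt (for example via credentials in an option file), and that the script is launched by xinetd so that whatever it prints to stdout is sent back to HAProxy. The function names are made up for this example. It asks the local server for its own row in `performance_schema.replication_group_members` and answers HTTP 200 only when the state is ONLINE, so RECOVERING, OFFLINE, ERROR, or a failed query all produce a 503.

```python
#!/usr/bin/env python3
"""Health check sketch for xinetd: HTTP 200 only when this node is ONLINE."""
import subprocess
import sys

# Asks the local server for its own state within the replication group.
STATE_QUERY = (
    "SELECT MEMBER_STATE FROM performance_schema.replication_group_members "
    "WHERE MEMBER_ID = @@server_uuid"
)


def fetch_member_state():
    """Return this node's MEMBER_STATE, or 'UNKNOWN' if the query fails.

    Assumption: the mysql client finds its credentials on its own
    (e.g. from an option file), so none are passed here.
    """
    try:
        result = subprocess.run(
            ["mysql", "-N", "-B", "-e", STATE_QUERY],
            capture_output=True, text=True, timeout=5,
        )
    except (OSError, subprocess.TimeoutExpired):
        return "UNKNOWN"
    state = result.stdout.strip()
    if result.returncode != 0 or not state:
        return "UNKNOWN"
    return state


def build_response(state):
    """Build the HTTP reply: 200 for ONLINE, 503 for every other state."""
    status = "200 OK" if state == "ONLINE" else "503 Service Unavailable"
    body = f"MySQL group member state: {state}\r\n"
    return (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
        f"{body}"
    )


if __name__ == "__main__":
    # xinetd connects the client socket to stdout, so printing is enough.
    sys.stdout.write(build_response(fetch_member_state()))
```

On the HAProxy side you would then switch the backend from a plain TCP check to an HTTP check (`option httpchk`) and point each server's `check port` at whichever port you expose the xinetd service on, so traffic still goes to MySQL while the health decision comes from the script.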
I already do this with my async Source/Replica databases and HAProxy to check for replication delays and number of connections, and have just started exploring InnoDB Cluster. These were some of the ideas I had listed for testing when I get a moment.