MySQL Group Replication failover detection while a node is recovering



  • I'm trying to build a solid MySQL Group Replication load-balancing/failover setup. Currently I use keepalived to share one private IP for connecting to the load-balanced MySQL Group Replication cluster; balancing and failover are handled by haproxy (TCP checks on port 33061), which works great.

    However, once a node becomes unreachable (because of network issues) and is eventually dropped from the cluster, we have to join it back to the cluster, which is fine and works. During the recovery phase (state RECOVERING), though, the node is already listening on port 33061, so haproxy treats it as eligible for load balancing and failover even though it is not yet an operational member of the cluster.

    Is there a check I can add to keep the node out of rotation while it is still in the RECOVERING state and joining the cluster? Usually this process is quick, but a few times it has taken up to 15 minutes, causing database errors during that phase. Many thanks!



  • Option 1) Could you put mysqlrouter between HAProxy and MySQL (e.g. HAProxy >> Router >> MySQL)? I believe the router should (in theory) see that the node is "recovering" and ignore it.

    Option 2) Alternatively, could you use something like this: https://sysbible.org/2008/12/04/having-haproxy-check-mysql-status-through-a-xinetd-script/

    Essentially, you use xinetd on the database server to query MySQL and return a yes/no availability result, which tells HAProxy whether a given replica is available for use.

    I already do this with my async Source/Replica databases and HAProxy to check for replication delays and number of connections, and I have just started exploring InnoDB Cluster. These were some of the ideas I had listed for testing when I get a moment.
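
For the xinetd idea in option 2, here is a minimal, untested-in-production sketch of what the check script could look like for Group Replication. It assumes the `mysql` command-line client is installed on the database server and can log in through an option file such as `~/.my.cnf`; the function names are made up for illustration. It reads the local member's state from `performance_schema.replication_group_members` and answers HTTP 200 only when that state is ONLINE, so a RECOVERING node reports 503.

```python
import subprocess
import sys

# MEMBER_STATE for the local server; MEMBER_ID matches the server's UUID.
STATE_QUERY = (
    "SELECT MEMBER_STATE FROM performance_schema.replication_group_members "
    "WHERE MEMBER_ID = @@server_uuid"
)


def fetch_member_state():
    """Return the local MEMBER_STATE string, or None if it cannot be read."""
    try:
        # Assumption: credentials are supplied by an option file (~/.my.cnf).
        result = subprocess.run(
            ["mysql", "--batch", "--skip-column-names", "-e", STATE_QUERY],
            capture_output=True, text=True, timeout=5, check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Client missing, login failed, query failed, or timed out.
        return None
    return result.stdout.strip() or None


def build_response(member_state):
    """Build the HTTP reply: 200 only for ONLINE, 503 for anything else."""
    ok = member_state == "ONLINE"
    status = "200 OK" if ok else "503 Service Unavailable"
    body = f"member state: {member_state or 'UNKNOWN'}\r\n"
    return (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n" + body
    )


def main():
    # xinetd connects the client socket to stdout, so printing is enough.
    sys.stdout.write(build_response(fetch_member_state()))
```

To deploy it, you would call `main()` at the end of the script, register the script as an xinetd service on a port of your choosing, and point HAProxy's health check at that port with an HTTP check (`option httpchk` plus `check port ...` on each server line) instead of the plain TCP check. Treating every state other than ONLINE as down is a deliberate choice here: it also takes OFFLINE, ERROR and unreadable nodes out of rotation.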



