From 694a7101d4d041508ff1c30af936aa5a4400f6e4 Mon Sep 17 00:00:00 2001
From: Justin <8886628+justinmir@users.noreply.github.com>
Date: Mon, 24 Mar 2025 06:28:20 -0700
Subject: [PATCH] Make MASTERDOWN a retriable error in RedisCluster client (#3164)

When clusters are running with `replica-serve-stale-data no`, replicas
return a MASTERDOWN error under two conditions:

1. The primary has failed and we are not serving requests.
2. A replica has just started and has not yet synced from the primary.

The first case, where the primary has failed and we are not serving
requests, is similar to a CLUSTERDOWN error and should be similarly
retriable.

In the second case, where a replica has just started and has not yet
synced from the primary, the request should be retried on other available
nodes in the shard. Otherwise a percentage of the read requests to the
shard will fail.

Examples when `replica-serve-stale-data no` is enabled:

1. In a cluster using `ReadOnly` with a single read replica, every read
   request returns an error to the client because MASTERDOWN is not a
   retriable error.
2. In a cluster using `RouteRandomly`, a percentage of the requests return
   errors to the client, depending on whether this server was selected.

Co-authored-by: Nedyalko Dyakov
---
 error.go | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/error.go b/error.go
index ec2224c0..6f47f7cf 100644
--- a/error.go
+++ b/error.go
@@ -75,6 +75,9 @@ func shouldRetry(err error, retryTimeout bool) bool {
 	if strings.HasPrefix(s, "READONLY ") {
 		return true
 	}
+	if strings.HasPrefix(s, "MASTERDOWN ") {
+		return true
+	}
 	if strings.HasPrefix(s, "CLUSTERDOWN ") {
 		return true
 	}