* fix(pool): wip, pool reauth should not interfere with handoff
* fix credListeners map
* fix race in tests
* better conn usable timeout
* add design decision comment
* few small improvements
* update marked as queued
* add Used to clarify the state of the conn
* rename test
* fix(test): fix flaky test
* lock inside the listeners collection
* address pr comments
* Update internal/auth/cred_listeners.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update internal/pool/buffer_size_test.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* wip refactor entraid
* fix maintnotifications pool hook
* fix mocks
* fix nil listener
* sync and async reauth based on conn lifecycle
* be able to reject connection OnGet
* pass hooks so the tests can observe reauth
* give the background worker some time to execute commands
* fix tests
* only async reauth (see the sketch after this group of commits)
* Update internal/pool/pool.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update internal/auth/streaming/pool_hook.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update internal/pool/conn.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
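For illustration, a minimal sketch of the reauth lifecycle described by the commits above: OnGet can reject a connection, and reauth runs asynchronously instead of blocking the caller. The hook interface, type names, and fields are assumptions for the sketch, not go-redis's internal API:
```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// conn is a stand-in for a pooled connection.
type conn struct {
	needsReauth atomic.Bool // credentials changed while conn was in use
	inHandoff   atomic.Bool // conn is being handed off to another endpoint
}

var errConnNotUsable = errors.New("conn not usable: pending reauth or handoff")

// reauthHook sketches a pool hook tied to the connection lifecycle.
type reauthHook struct{}

// OnGet may reject a connection so the pool hands out another one.
func (h *reauthHook) OnGet(ctx context.Context, cn *conn) error {
	if cn.inHandoff.Load() || cn.needsReauth.Load() {
		return errConnNotUsable
	}
	return nil
}

// OnPut triggers reauth asynchronously instead of blocking the caller
// that is returning the connection to the pool.
func (h *reauthHook) OnPut(ctx context.Context, cn *conn) {
	if cn.needsReauth.CompareAndSwap(true, false) {
		go func() {
			// AUTH with fresh credentials would be issued here.
			fmt.Println("reauthenticated asynchronously")
		}()
	}
}

func main() {
	h := &reauthHook{}
	cn := &conn{}
	cn.needsReauth.Store(true)

	fmt.Println("OnGet:", h.OnGet(context.Background(), cn)) // rejected

	h.OnPut(context.Background(), cn)
	time.Sleep(10 * time.Millisecond) // give the background time to run
}
```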
* chore(redisotel): use metric.WithAttributeSet to avoid copy (#3552)
To improve performance, replace `WithAttributes` with `WithAttributeSet`.
This avoids the slice allocation and copy that `WithAttributes` performs on every call.
For more information see https://github.com/open-telemetry/opentelemetry-go/blob/v1.38.0/metric/instrument.go#L357-L376
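For illustration, a minimal sketch of the pattern (meter and attribute names are invented for the example):
```go
package main

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

func main() {
	ctx := context.Background()
	counter, _ := otel.Meter("example").Int64Counter("redis.operations")

	// Before: WithAttributes allocates a new slice and copies the
	// attributes on every measurement.
	counter.Add(ctx, 1, metric.WithAttributes(attribute.String("op", "get")))

	// After: the attribute.Set is built once and can be reused, so no
	// per-call slice copy is needed.
	attrs := attribute.NewSet(attribute.String("op", "get"))
	counter.Add(ctx, 1, metric.WithAttributeSet(attrs))
}
```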
* chore(docs): explain why MaxRetries is disabled for ClusterClient (#3551)
Co-authored-by: Nedyalko Dyakov <1547186+ndyakov@users.noreply.github.com>
* exponential backoff
* address pr comments
* address pr comments
* remove rlock
* add some comments
* add comments
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Warnar Boekkooi <wboekkooi@impossiblecloud.com>
Co-authored-by: Justin <justindsouza80@gmail.com>
* fix(txpipeline): should return error on multi/exec on multiple slots
* fix(txpipeline): test normal tx pipeline behaviour
* chore(err): Extract crossslot err and add test
* fix(txpipeline): short-circuit the tx if there are no commands
* chore(tests): validate keys are in different slots
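For illustration, a hedged sketch of the behavior these commits enforce; the cluster address is assumed, and the `{a}`/`{b}` hash tags guarantee the keys land in different slots:
```go
package main

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	rdb := redis.NewClusterClient(&redis.ClusterOptions{
		Addrs: []string{":7000"}, // assumed local cluster node
	})

	pipe := rdb.TxPipeline()
	// "{a}k" and "{b}k" carry different hash tags, so they map to
	// different slots; MULTI/EXEC cannot span slots.
	pipe.Set(ctx, "{a}k", "v1", 0)
	pipe.Set(ctx, "{b}k", "v2", 0)

	if _, err := pipe.Exec(ctx); err != nil {
		fmt.Println("cross-slot tx rejected:", err)
	}
}
```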
When clusters are running with `replica-serve-stale-data no`, replicas
will return a MASTERDOWN error under two conditions:
1. The primary has failed and the replica is not serving requests.
2. The replica has just started and has not yet synced from the primary.
The first case is similar to a CLUSTERDOWN error and should be retriable
in the same way. In the second case the request should be retried on the
other available nodes in the shard; otherwise a percentage of the read
requests to the shard will fail.
Examples when `replica-serve-stale-data no` is enabled:
1. In a cluster using `ReadOnly` with a single read replica, every
read request will return an error to the client, because MASTERDOWN is
not a retriable error.
2. In a cluster using `RouteRandomly`, a percentage of the requests
will return errors to the client, depending on whether such a replica
was selected.
Co-authored-by: Nedyalko Dyakov <nedyalko.dyakov@gmail.com>
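For illustration, a sketch of the retry decision this change implies; this is not go-redis's exact retry code, just the shape of the check:
```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// isRetriable treats MASTERDOWN like CLUSTERDOWN so the request can be
// retried on another node in the shard instead of failing immediately.
func isRetriable(err error) bool {
	if err == nil {
		return false
	}
	msg := err.Error()
	return strings.HasPrefix(msg, "CLUSTERDOWN") ||
		strings.HasPrefix(msg, "MASTERDOWN")
}

func main() {
	err := errors.New("MASTERDOWN Link with MASTER is down and replica-serve-stale-data is set to 'no'.")
	fmt.Println(isRetriable(err)) // true
}
```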
* fix: recycle connections in some Redis Cluster scenarios
This issue surfaced in a Cloud Provider solution that rolls out new
nodes using the same address (hostname) as the nodes being replaced in
a Redis Cluster, while the old nodes, once demoted to replicas, stay
in service for some minutes to redirect traffic.
The fix identifies when a connection may be stale: a MOVED response
that points to the same address (hostname) the connection is already
using. At that point the connection is considered no longer usable and
is forced to be recycled.
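For illustration, a sketch of the staleness check; the helper and the node address are hypothetical, not go-redis internals:
```go
package main

import (
	"fmt"
	"strings"
)

// movedToSameAddr reports whether a MOVED redirect points at the address
// the current connection is already using, which means the node behind
// that hostname was replaced and the connection should be recycled.
func movedToSameAddr(movedErr, connAddr string) bool {
	// A MOVED reply has the form: "MOVED 3999 127.0.0.1:6381".
	parts := strings.Fields(movedErr)
	return len(parts) == 3 && parts[0] == "MOVED" && parts[2] == connAddr
}

func main() {
	connAddr := "node-1.example.com:6379" // hypothetical node address
	if movedToSameAddr("MOVED 3999 node-1.example.com:6379", connAddr) {
		fmt.Println("connection is stale: recycle it")
	}
}
```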