Replicated requests
Replicate requests only when reduced tail latency is worth duplicate work, and cancel losers as soon as a winner is good enough.
Canonical guidance
- hedge only latency-sensitive, safe-to-duplicate work
- define what counts as a winning result
- cancel the rest immediately
Use when
- tail-latency reduction
- replicated reads
- multi-endpoint lookups where duplicates are acceptable
Avoid
- hedging writes or other non-idempotent operations, where duplicates cause double effects
- launching duplicates with no cancellation path
- using replication where one slow dependency should instead be fixed or rate-limited
Preferred pattern
- launch bounded replicas with a shared context and accept the first sufficient result
Anti-pattern
- fan out duplicates and then wait for all of them anyway
Explanation: This is tempting because it simplifies collection logic, but it keeps paying the duplicate cost after the answer is already known.
Why
- replicated requests trade extra load for lower tail latency; that trade must stay explicit
Sources
- Concurrency is not parallelism - Rob Pike
- Go Concurrency Patterns: Context - Sameer Ajmani